Abstract

We consider multiple access communication on a binary input additive white Gaussian noise channel 
using randomly spread code division. For a general class of symmetric distributions for spreading 
coefficients, in the limit of a large number of users, we prove an upper bound on the capacity, which 
matches a formula that Tanaka obtained by using the replica method. We also show concentration 
of various relevant quantities including mutual information, capacity and free energy. The mathematical
methods are quite general and allow us to discuss extensions to other multiuser scenarios.

1 Introduction 

Code Division Multiple Access (CDMA) has been a successful scheme for reliable communication between 
multiple users and a common receiver. The scheme consists of K users modulating their information 
sequence by a signature sequence, also known as spreading sequence, of length N and transmitting. 
The number N is sometimes referred to as the spreading gain or the number of chips per sequence. 
The receiver obtains the sum of all transmitted signals and the noise, which is assumed to be white and
Gaussian (AWGN).

The achievable rate region (for real valued inputs) with power constraints and optimal decoding has 
been given in [1]. There it is shown that the achievable rates depend only on the correlation matrix
of the spreading coefficients. It is well known that these detectors have exponential (in K) complexity. 
Therefore, it is important to analyze the performance under sub-optimal but low-complexity detectors 
like the linear detectors. For a good overview of these detectors we refer to [2]. In [3], the authors 
considered random spreading (spreading sequences are chosen randomly) and analyzed the spectral 
efficiency, defined as the bits per chip that can be reliably transmitted, for these detectors. In the large-system
limit ($K \to \infty$, $N \to \infty$, $K/N = \beta$) they obtained explicit analytical formulas for the spectral efficiency
and showed that it concentrates. These formulas follow from the known spectrum of large covariance 
matrices. In [4], [5] the authors analyzed the signal to interference ratio for the decorrelator and the
MMSE receiver and showed that it is asymptotically Gaussian with variance going to zero. 

Now consider the case where the user input is restricted to take only binary values. Not much 
is known in this case except for the spectral efficiency in the case of high SNR which is analyzed in 
[6]. The random matrix techniques used for Gaussian inputs do not apply here because the spectral 
efficiency cannot be written in terms of just the covariance matrix of the spreading sequences. Tanaka [7] 
applied the formal replica method, developed in statistical mechanics, to this problem and conjectured 
the formula for spectral efficiency and bit error rate (BER) for uncoded transmission. These results were 
later extended in [5] to include the case of unequal powers and channel with fading. The replica method 
is non-rigorous but believed to yield exact results for some models in statistical mechanics [9]. More
recently Montanari and Tse [10] have made progress towards a rigorous derivation of Tanaka's capacity 
formula in a restricted range of parameters. 

Our main contributions in this paper are twofold. First we prove that Tanaka's formula is an upper
bound to the capacity for all values of the parameters and second we prove various useful concentration 
theorems in the large-system limit. 

1.1 Statistical Mechanics Approach 

There is a natural connection between various communication systems and statistical mechanics of 
random spin systems, stemming from the fact that often in both systems there is a large number of 
degrees of freedom (bits or spins), interacting locally, in a random environment. So far, there have 
been applications of two important but somewhat complementary approaches of statistical mechanics of 
random systems. 

The first one is the very important but mathematically uncontrolled replica method. The merit of 
this approach is to obtain conjectural but rather explicit formulas for quantities of interest such as
free energy, conditional entropy or error probability. In some cases the natural fixed point structure
embodied in the mean field formulas allows one to guess good iterative algorithms. This program has been
carried out for linear error correcting codes, source coding, multiuser settings like the broadcast channel (see
for example [11], [12], [13]) and the case of interest here [7]: randomly spread CDMA with binary inputs.

The second type of approach aims at a rigorous understanding of the replica formulas and has its 
origins in methods stemming from mathematical physics (see [14, 15], [9]). For systems whose underlying
degrees of freedom have Gaussian distribution (Gaussian input symbols or Gaussian spins in continuous 
spin systems) random matrix methods can successfully be employed. However when the degrees of 
freedom are binary (binary information symbols or Ising spins) these seem to fail, but the recently
developed interpolation method [14], [15] has had some success. The basic idea of the interpolation
method is to study a measure which interpolates between the posterior measure of the ideal decoder and
a mean field measure. The latter can be guessed from the replica formulas and from this perspective the
replica method is a valuable tool. So far this program has been developed only for linear error correcting
codes on sparse graphs and binary input symmetric channels [16], [17].

In this paper we develop the interpolation method for the random CDMA system with binary inputs 
(in the large-system limit). The situation is qualitatively different from the ones mentioned above in that
the "underlying graph" is complete. Superficially one might think that it is similar to the Sherrington- 
Kirkpatrick model which was the first one treated by the interpolation method. However as we will see 
the analysis of the randomly spread CDMA system is substantially different due to the structure of the 
interaction between degrees of freedom. 

1.2 Communication Setup 

We consider a scenario where $K$ users send binary information symbols $x = (x_1, \dots, x_K)^t$, $x_k \in \{\pm 1\}$,
to a common receiver, through a single AWGN channel. Each user $k$ has a random signature sequence
$s_k = (s_{1k}, \dots, s_{Nk})^t$ where the components are independent and identically distributed. For each time
division (or chip) interval $i = 1, \dots, N$ the received signal $y = (y_1, \dots, y_N)^t$ is

$$y_i = \frac{1}{\sqrt{N}} \sum_{k=1}^{K} s_{ik} x_k + \sigma n_i$$

where $n = (n_1, \dots, n_N)^t$ are independent identically distributed Gaussian variables $\mathcal{N}(0, 1)$ so that the
noise power is $\sigma^2$. The variance of $s_{ik}$ is set to 1 and the scaling factor $1/\sqrt{N}$ is introduced so that
the power (per symbol) of each user is normalized to 1. Our results hold for the rather wide class of 
distributions satisfying: 

Assumption A. The distribution $p(s_{ik})$ is symmetric

$$p(s_{ik}) = p(-s_{ik})$$

and has a rapidly decaying tail. More precisely, there exist positive constants $s_0$ and $A$ such that for all $s > s_0$

$$\mathbb{P}(s_{ik} > s) \le e^{-A s^2}$$

Footnote (to the remark on the interpolation method in Section 1.1): Let us point out that, as will be shown
later in this paper, the interpolation method can also serve as an alternative to random matrix theory for
Gaussian inputs.

In particular, our favorite Gaussian and binary cases are included in this class, and also any compactly
supported distribution. An inspection of our proofs suggests that the results could be extended to a 
larger class satisfying: 

Assumption B. The distribution $p(s_{ik})$ is symmetric with finite second and fourth moments.

However to keep the proofs as simple as possible only one of the theorems is proven with such generality. 
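
To make the channel model concrete, the following minimal sketch draws one realization of the received vector for Gaussian or binary spreading, both of which satisfy Assumption A. The function name `simulate_cdma` and its arguments are illustrative choices, not notation from the paper, and uniform inputs are assumed.

```python
import numpy as np

def simulate_cdma(K, N, sigma, spreading="gaussian", rng=None):
    """Draw (x, s, y) with y_i = N^{-1/2} sum_k s_ik x_k + sigma * n_i."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.choice([-1.0, 1.0], size=K)              # uniform binary inputs (assumption)
    if spreading == "gaussian":
        s = rng.standard_normal((N, K))              # unit-variance Gaussian chips
    elif spreading == "binary":
        s = rng.choice([-1.0, 1.0], size=(N, K))     # symmetric +/-1 chips
    else:
        raise ValueError("unknown spreading distribution")
    n = rng.standard_normal(N)                       # white Gaussian noise, variance 1
    y = s @ x / np.sqrt(N) + sigma * n
    return x, s, y

# Example: K = 8 users, N = 16 chips, binary spreading
x, s, y = simulate_cdma(K=8, N=16, sigma=0.5, spreading="binary",
                        rng=np.random.default_rng(0))
```

With `sigma=0` the output reduces to the noiseless superposition of the spread symbols, which is a convenient sanity check.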

In the sequel we use the notations $s$ for the $N \times K$ matrix $(s_{ik})$, $S$ for the corresponding random
matrix, and $X$, $Y$ for the input and output random vectors.

Our main interest is in proving a "tight" upper bound on 

$$C_K = \frac{1}{K} \max_{p_X} \mathbb{E}_S[I(X;Y)] \tag{1}$$

in the large-system limit $K \to +\infty$ with $K/N = \beta$ fixed. In the next few paragraphs we discuss various
settings for which it is justified to consider this formula as a capacity. In principle for multiaccess 
channels one maximizes over product distributions $p_X(x) = \prod_{k=1}^{K} p_k(x_k)$. But in fact this restriction
makes no difference when one maximizes the expected mutual information because the maximum is
attained for a uniform distribution. Indeed for any given $s$ the mutual information $I(X;Y)$ is a concave
functional of $p_X$ and thus so is its average. Moreover the latter is invariant under the transformations
$p_X(x_1, x_2, \dots, x_K) \to p_X(\epsilon_1 x_1, \epsilon_2 x_2, \dots, \epsilon_K x_K)$ where $\epsilon_i = \pm 1$. Combining these two facts we deduce that
the maximum in (1) is attained for the convex combination

$$\frac{1}{2^K} \sum_{\epsilon_1, \dots, \epsilon_K} p_X(\epsilon_1 x_1, \dots, \epsilon_K x_K) = \frac{1}{2^K}$$

which is nothing else than the product of uniform distributions for each user. Before discussing the 
meaning of (1) for the CDMA setting let us note that it can also be interpreted as the capacity of a MIMO
system with binary constellations, $K$ transmit, $N$ receive antennas, and ergodic channel coefficients $s_{ik}$
that are known to the receiver only [18], [19].

In the traditional CDMA setting (see for example [2]) the spreading sequences are assigned to each 
user and do not change from symbol to symbol. Moreover it is assumed that the users and the receiver 
know s. The general analysis of multiaccess channels implies that the total capacity per user (or maximal 
achievable sum rate) is 

$$\frac{1}{K} \max_{\prod_k p_k} I(X;Y) \tag{2}$$

where the maximum is over $p_i(x) = p_i \delta(x - 1) + (1 - p_i)\delta(x + 1)$ with $p_i \in [0,1]$, $i = 1, \dots, K$. In the
large-system limit we are able to prove a concentration theorem for the mutual information $I(X;Y)$
which implies that if $(p_1, \dots, p_K)$ belongs to a finite discrete set $\mathcal{D}$ with cardinality increasing at most
polynomially in $K$, then (2) concentrates on $\frac{1}{K}\max_{p \in \mathcal{D}} \mathbb{E}_S[I(X;Y)]$. Of course by the same argument
as before this maximum is attained for $p_i = \frac{1}{2}$ for all $i$, as long as this point belongs to $\mathcal{D}$. Unfortunately, in order to extend these
arguments to the more realistic case of exponential cardinality of $\mathcal{D}$, or even all possible continuous
values of the input distribution (and thus to fully justify (1)) we would have to prove stronger forms of
concentration.

At this point it is interesting to discuss the situation for the continuous input case. There it is known 
that the maximum of (2) is attained for a Gaussian input distribution independent of the spreading
sequence realization [1]. Then the concentration theorems for $I(X;Y)$ suffice to prove that in the large-system
limit (2) asymptotically equals (1). It is an open problem to decide if an analogous result holds
in the binary input case, namely that the maximum of (2) is attained for the uniform distribution. We
conjecture that this is the case. 

Alternatively, following [3] one may consider the case of "long spreading sequences" , that is sequences 
that extend over many symbol durations. Then by "ergodicity" one can compute the capacity as an 
expectation of (2) over $S$. In the continuous input case it turns out that one can switch the expectation
and the maximum because it can be shown (by the standard argument adapted above for the binary 
case) that the maximum of the expectation is attained for the same Gaussian input distribution. Thus, 
remarkably, in the continuous case one exchanges the expectation over S with the maximum over product 
distributions even for finite K. 

Finally let us return to the binary case and consider the situation of long spreading sequences as 
in [3] that are assumed to be unknown (or rather not used) to the encoder and known to the receiver. 
Then, by the analysis in [TJ], formula (1) gives the capacity. If users do not cooperate, $p_X$ is really a
product distribution. But in any case the maximum is attained for the uniform distribution.

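Let us now collect a few formulas that will be useful in the sequel. The conditional entropy
$H(X \mid Y) = \mathbb{E}_{Y|s}[H(X \mid y)]$ is the average over $Y$ given $s$ of the Shannon entropy for the posterior distribution

$$p(x \mid y, s) = \frac{p_X(x)}{Z(y, s)} \exp\Big(-\frac{1}{2\sigma^2}\big\|y - N^{-1/2} s x\big\|^2\Big) \tag{3}$$

with the normalization factor

$$Z(y, s) = \sum_{x} p_X(x) \exp\Big(-\frac{1}{2\sigma^2}\big\|y - N^{-1/2} s x\big\|^2\Big) \tag{4}$$

Note that this is the distribution used by the ideal or optimal detector. The average over $Y$ is carried
out with the distribution induced by the channel transition probability

$$p(y \mid s) = \sum_{x^0} p_X(x^0) \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\Big(-\frac{1}{2\sigma^2}\big\|y - N^{-1/2} s x^0\big\|^2\Big) \tag{5}$$
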
where in the sum $x^0$ is interpreted as the input signal. The normalization factor (4) can be interpreted
as the partition function of interacting Ising spins $x_k = \pm 1$ with free measure $p_X$. In view of this it is
not surprising that the free energy

$$f(y, s) = \frac{1}{K} \ln Z(y, s) \tag{6}$$

plays a crucial role. In Appendix A we show that it is related to the mutual information by

$$\frac{1}{K} I(X;Y) = -\frac{1}{2\beta} - \mathbb{E}_{Y|s}[f(y, s)] \tag{7}$$

Therefore

$$C_K = -\frac{1}{2\beta} - \min_{p_X} \mathbb{E}_{Y,S}[f(y, s)] \tag{8}$$

Of course, by the previous discussion, the minimum over $p_X$ is attained for $p_X(x) = 2^{-K}$.

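
For very small systems the relation between free energy and mutual information can be checked by brute force. The sketch below enumerates all $2^K$ inputs to evaluate $\ln Z(y, s)$ for the uniform prior and then estimates $\mathbb{E}_S[I(X;Y)]$ (in nats) from the relation $\frac{1}{K} I = -\frac{1}{2\beta} - \mathbb{E}[f]$. The function names, the Gaussian spreading and the sample size are assumptions made for illustration only.

```python
import itertools
import numpy as np

def log_partition(y, s, sigma):
    """ln Z(y, s) for the uniform prior p_X = 2^{-K}, by enumerating all inputs."""
    N, K = s.shape
    xs = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))   # shape (2^K, K)
    resid = y[None, :] - xs @ s.T / np.sqrt(N)                       # shape (2^K, N)
    expo = -np.sum(resid**2, axis=1) / (2 * sigma**2) - K * np.log(2.0)
    top = expo.max()                                                 # log-sum-exp trick
    return top + np.log(np.sum(np.exp(expo - top)))

def mutual_information_mc(K, N, sigma, samples, rng):
    """Monte Carlo estimate of E_S[I(X;Y)] in nats from the free energy."""
    total = 0.0
    for _ in range(samples):
        s = rng.standard_normal((N, K))                  # Gaussian spreading (assumption)
        x0 = rng.choice([-1.0, 1.0], size=K)             # transmitted bits
        y = s @ x0 / np.sqrt(N) + sigma * rng.standard_normal(N)
        total += log_partition(y, s, sigma)
    mean_f = total / (samples * K)                       # estimate of E[f(y, s)]
    beta = K / N
    return K * (-1.0 / (2 * beta) - mean_f)              # I = K * (-1/(2 beta) - E[f])

I_est = mutual_information_mc(K=4, N=4, sigma=1.0, samples=2000,
                              rng=np.random.default_rng(0))
```

The estimate should lie between 0 and $K \ln 2$ up to Monte Carlo error; the enumeration is exponential in $K$, so this is only meant as a check of the formulas, not as a method for large systems.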
1.3 Tanaka's formula for binary inputs 

By using the formal replica trick of statistical mechanics Tanaka reduced the calculation of the conditional
entropy to a variational problem. His conjectural formula is

$$\lim_{K \to \infty} C_K = \min_{m \in [0,1]} c_{RS}(m) \tag{9}$$

where the "replica symmetric capacity functional"

$$c_{RS}(m) = \frac{\lambda}{2}(1 + m) - \frac{1}{2\beta} \ln \lambda\sigma^2 - \int Dz \, \ln\big(2\cosh(\sqrt{\lambda} z + \lambda)\big) \tag{10}$$

with

$$\lambda = \frac{1}{\sigma^2 + \beta(1 - m)} \tag{11}$$

and $Dz$ the standard Gaussian measure $Dz = \frac{e^{-z^2/2}}{\sqrt{2\pi}} dz$, has to be minimized over a parameter $m$ (see footnote 2). It is
easy to see (footnote 3) that the minimizer must satisfy the fixed point condition

$$m = \int Dz \, \tanh(\sqrt{\lambda} z + \lambda) \tag{12}$$

Footnote 2: This parameter can be interpreted as the expected value of the MMSE estimate for the information bits.

Footnote 3: Using the integration by parts formula for Gaussian random variables.

The formal calculations involved in the replica method make clear that the formula (9) should not depend
on the distribution of the spreading sequence (see [7]).
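
As a numerical illustration of the variational problem (9)-(12), the following sketch evaluates the functional (10) exactly as written above with Gauss-Hermite quadrature, iterates the fixed point condition (12), and compares the result with the location of the minimum in (9) found on a grid. The helper names, the quadrature order and the parameter values $\beta = 1$, $\sigma^2 = 0.5$ are illustrative assumptions; for parameters where (12) has several solutions the plain iteration may converge to a solution other than the global minimizer.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes/weights for the standard Gaussian measure Dz
_z, _w = hermegauss(80)
_w = _w / np.sqrt(2 * np.pi)

def lam(m, beta, sigma2):
    """Effective SNR lambda of (11)."""
    return 1.0 / (sigma2 + beta * (1.0 - m))

def c_rs(m, beta, sigma2):
    """Replica symmetric functional (10), in nats, transcribed from the text."""
    lam_m = lam(m, beta, sigma2)
    a = np.sqrt(lam_m) * _z + lam_m
    log2cosh = np.logaddexp(a, -a)                     # ln(2 cosh a), overflow-safe
    return 0.5 * lam_m * (1 + m) - np.log(lam_m * sigma2) / (2 * beta) - _w @ log2cosh

def fixed_point_map(m, beta, sigma2):
    """Right-hand side of the fixed point condition (12)."""
    lam_m = lam(m, beta, sigma2)
    return _w @ np.tanh(np.sqrt(lam_m) * _z + lam_m)

def solve_fixed_point(beta, sigma2, m0=0.0, iters=500):
    """Plain iteration m <- RHS of (12); adequate when the solution is unique."""
    m = m0
    for _ in range(iters):
        m = fixed_point_map(m, beta, sigma2)
    return m

beta, sigma2 = 1.0, 0.5
m_star = solve_fixed_point(beta, sigma2)
grid = np.linspace(0.0, 1.0, 2001)
m_grid = grid[np.argmin([c_rs(m, beta, sigma2) for m in grid])]
```

Here `m_star` and `m_grid` should agree up to the grid resolution, which illustrates that the minimum in (9) is located at a solution of (12).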

In the present problem one expects a priori that replica symmetry is not broken because of a gauge 
symmetry induced by channel symmetry. For this reason Tanaka's formula is conjectured to be exact. 
Our upper bound (Theorem 6) on the capacity precisely coincides with the above formulas and strongly
supports this conjecture. 

Recent work announced by Montanari and Tse [10] also provides strong support to the conjecture
at least in a regime of $\beta$ without phase transitions (more precisely, for $\beta < \beta_s(\sigma)$ where $\beta_s(\sigma)$ is the
maximal value of $\beta$ such that the solution of (12) remains unique). The authors first solve the case
of sparse signature sequences (using the area theorem and the data processing inequality) in the limit
$K \to \infty$. Then the dense signature sequence (which is of interest here) is recovered by exchanging the
$K \to \infty$ and sparse $\to$ dense limits.



1.4 Gaussian inputs 

In the case of continuous inputs $x_k \in \mathbb{R}$, the sums $\sum_x$ in formulas (4), (5) are replaced by $\int dx$. The capacity is
maximized by a Gaussian prior,

$$p_X(x) = \frac{e^{-\|x\|^2/2}}{(2\pi)^{K/2}} \tag{13}$$

and one can express it in terms of a determinant involving the correlation matrix of the spreading
sequences. Using the exact spectral measure given by random matrix theory Shamai and Verdu [3]
obtained the rigorous result

$$\lim_{K \to \infty} C_K = \frac{1}{2}\log\Big(1 + \sigma^{-2} - \frac{1}{4}Q(\sigma^{-2}, \beta)\Big) + \frac{1}{2\beta}\log\Big(1 + \sigma^{-2}\beta - \frac{1}{4}Q(\sigma^{-2}, \beta)\Big) - \frac{Q(\sigma^{-2}, \beta)}{8\beta\sigma^{-2}} \tag{14}$$

where

$$Q(x, z) = \Big(\sqrt{x(1 + \sqrt{z})^2 + 1} - \sqrt{x(1 - \sqrt{z})^2 + 1}\Big)^2$$

On the other hand Tanaka applied the formal replica method to this case and found (9) with

$$c_{RS}(m) = \frac{1}{2}\log(1 + \lambda) - \frac{1}{2\beta}\log \lambda\sigma^2 - \frac{\lambda}{2}(1 - m) \tag{15}$$

where $\lambda = (\sigma^2 + \beta(1 - m))^{-1}$. The minimizer satisfies

$$m = \frac{\lambda}{1 + \lambda} \tag{16}$$

Solving (16) we obtain $m = \frac{\sigma^2}{4\beta} Q(\sigma^{-2}, \beta)$ and substituting this in (15) gives the equality between (14)
and (15). So at least for the case of Gaussian inputs we are already assured that the replica method
finds the correct solution.
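
The agreement between the random matrix result (14) and the replica expression (15) can be checked numerically. The sketch below is a minimal check under the reading of (14)-(16) given above: it evaluates both sides at the value of $m$ solving (16), written in closed form as $m = \frac{\sigma^2}{4\beta} Q(\sigma^{-2}, \beta)$. Function names are illustrative and natural logarithms are assumed throughout.

```python
import numpy as np

def Q(x, z):
    """Auxiliary function Q(x, z) entering (14)."""
    return (np.sqrt(x * (1 + np.sqrt(z))**2 + 1)
            - np.sqrt(x * (1 - np.sqrt(z))**2 + 1))**2

def capacity_random_matrix(beta, sigma2):
    """Per-user capacity (14) for Gaussian inputs, in nats."""
    snr = 1.0 / sigma2
    q = Q(snr, beta)
    return (0.5 * np.log(1 + snr - q / 4)
            + np.log(1 + snr * beta - q / 4) / (2 * beta)
            - q / (8 * beta * snr))

def c_rs_gaussian(m, beta, sigma2):
    """Replica symmetric functional (15) for Gaussian inputs, in nats."""
    lam = 1.0 / (sigma2 + beta * (1 - m))
    return (0.5 * np.log(1 + lam)
            - np.log(lam * sigma2) / (2 * beta)
            - 0.5 * lam * (1 - m))

def m_closed_form(beta, sigma2):
    """Solution of the fixed point equation (16) expressed through Q."""
    return sigma2 * Q(1.0 / sigma2, beta) / (4 * beta)

beta, sigma2 = 1.0, 1.0
m = m_closed_form(beta, sigma2)
gap = capacity_random_matrix(beta, sigma2) - c_rs_gaussian(m, beta, sigma2)
```

For any admissible pair of parameters, `gap` should vanish up to floating point rounding.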

As we will show in Section 7.3, our methods also work in the case of Gaussian inputs, and yield the
upper bound.



1.5 Contributions and organization of this work 

The main focus and challenge of this work is on the case of binary inputs for the communication setup
described above, although the methods also work for many other constellations including Gaussian
inputs. The main results are explained in Section 2 while the remaining sections are devoted to the
proofs.

We prove concentration of the mutual information in the limit of $K \to +\infty$ with $\beta = K/N$ fixed
(Theorems 1 and 3 in Section 2.1). As we will see the mathematical underpinning of this is the concentration
of a more fundamental object, namely, the "free energy" of the associated spin system (Theorem 2). In
fact this turns out to be important in the proof of the bound on capacity. When the spreading coefficients
are Gaussian the main tool used is a powerful theorem [9] on the concentration of Lipschitz functions
of many independent Gaussian variables, and this leads to subexponential concentration bounds. For
more general spreading coefficient distributions such tools do not suffice and we have to combine them
with martingale arguments which lead to weaker algebraic bounds. Since the concentration proofs are
mainly technical they are presented in Appendices B and C.

Sections 3 and 4 form the core of the paper. They detail the proof of the main Theorem 6 announced
in Section 2.4, namely the tight upper bound on capacity. We use ideas from the interpolation method
combined with a non-trivial concentration theorem for the empirical average of soft bit estimates.

Section 5 shows that the average capacity is independent of the spreading sequence distribution at
least for the case where it is symmetric and decays fast enough (Theorem 4 in Section 2.2). This enables
us to restrict ourselves to the case of Gaussian spreading sequences which is more amenable to analysis.
The existence of the limit $K \to \infty$ for the capacity is shown in Section 6.

Section 7 discusses various extensions of this work. We sketch the treatment for unequal powers for
each user as well as colored noise. As alluded to before the bound on capacity for the case of Gaussian
inputs can also be obtained by the present method and we give some indications to this effect.

The appendices contain the proofs of various technical calculations. Preliminary versions of the
results obtained in this paper have been summarized in references [20] and [21].



2 Main Results 

2.1 Concentration 

In the case of a Gaussian input signal, the concentration can be deduced from general theorems on the 
concentration of the spectral density for random matrices, but this approach breaks down for binary 
inputs. Here we prove, 

Theorem 1 (concentration of capacity, Gaussian spreading sequence, binary inputs). Assume
the distributions $p(s_{ik})$ are standard Gaussians. Given $\epsilon > 0$, there exists an integer $K_1 = O(|\ln \epsilon|)$
independent of $p_X$, such that for all $K > K_1$,

$$\mathbb{P}\big[|I(X;Y) - \mathbb{E}_S[I(X;Y)]| \ge \epsilon K\big] \le 3 e^{-\alpha_1 \epsilon^2 K}$$
where ct x = ^ct 4 (64/3 + 32 + cr 2 )" 1 . 



The mathematical underpinning of this result is in fact a more general concentration result for the
free energy (6), that will be of some use later on.

Theorem 2 (concentration of free energy, Gaussian spreading sequence, binary inputs).
Assume the distributions $p(s_{ik})$ are standard Gaussians. Given $\epsilon > 0$, there exists an integer $K_2 = O(|\ln \epsilon|)$
independent of $p_X$, such that for all $K > K_2$,

$$\mathbb{P}\big[|f(y, s) - \mathbb{E}_{Y,S}[f(y, s)]| \ge \epsilon\big] \le 3 e^{-\alpha_2 \epsilon^2 K}$$
where a 2 = ^a 4 (3% (2^//3 + a)~ 2 . 



We prove these theorems thanks to powerful probabilistic tools developed by Ledoux and Talagrand for
Lipschitz functions of many Gaussian random variables. These tools are briefly reviewed in Appendix B
for the convenience of the reader and the proofs of the theorems are presented in Appendix C.
Unfortunately the same tools do not apply directly to the case of other spreading sequences. However
in this case the following weaker result can at least be obtained.

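Theorem 3 (concentration, general spreading sequence). Assume the spreading sequence satisfies
Assumption B. There exists an integer $K_1$ independent of $p_X$, such that for all $K > K_1$

$$\mathbb{P}\big[|I(X;Y) - \mathbb{E}_S[I(X;Y)]| \ge \epsilon K\big] \le \frac{\alpha}{K\epsilon^2}$$

$$\mathbb{P}\big[|f(y, s) - \mathbb{E}_{Y,S}[f(y, s)]| \ge \epsilon\big] \le \frac{\alpha}{K\epsilon^2}$$

for some constant $\alpha > 0$ independent of $K$.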

To prove such estimates it is enough (by Chebyshev's inequality) to control second moments. For the mutual
information we simply have to adapt martingale arguments of Pastur, Shcherbina and Tirozzi [22], [23],
whereas the case of free energy is more complicated because of the additional Gaussian noise fluctuations.
We deal with these by combining martingale arguments and Lipschitz function techniques.
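
Since the estimates above reduce to second moments, the concentration of the free energy can be illustrated on small systems by estimating its variance for two values of $K$ at fixed $\beta$. The sketch below does this by brute-force enumeration with binary spreading; the sample sizes, system sizes and function names are illustrative assumptions, and the experiment only suggests the trend, it does not verify the theorems.

```python
import itertools
import numpy as np

def free_energy(y, s, sigma):
    """f(y, s) = (1/K) ln Z(y, s) for the uniform prior, by enumeration."""
    N, K = s.shape
    xs = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))
    resid = y[None, :] - xs @ s.T / np.sqrt(N)
    expo = -np.sum(resid**2, axis=1) / (2 * sigma**2) - K * np.log(2.0)
    top = expo.max()                                   # log-sum-exp trick
    return (top + np.log(np.sum(np.exp(expo - top)))) / K

def free_energy_variance(K, beta, sigma, samples, rng):
    """Empirical variance of f over spreading sequences, inputs and noise."""
    N = int(round(K / beta))
    vals = []
    for _ in range(samples):
        s = rng.choice([-1.0, 1.0], size=(N, K))       # binary spreading
        x0 = rng.choice([-1.0, 1.0], size=K)
        y = s @ x0 / np.sqrt(N) + sigma * rng.standard_normal(N)
        vals.append(free_energy(y, s, sigma))
    return np.var(vals)

rng = np.random.default_rng(0)
var_small = free_energy_variance(K=4, beta=1.0, sigma=1.0, samples=400, rng=rng)
var_large = free_energy_variance(K=10, beta=1.0, sigma=1.0, samples=400, rng=rng)
```

One expects `var_large` to be noticeably smaller than `var_small`, in line with a variance that decays with the system size.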

The concentration of capacity, namely

$$\mathbb{P}\Big[\big|\max_{p_X} I(X;Y) - \max_{p_X} \mathbb{E}_S[I(X;Y)]\big| \ge \epsilon K\Big] \le \frac{\alpha}{K\epsilon^2} \tag{17}$$

would follow from a stronger (uniform concentration with respect to $p_X$)

$$\mathbb{P}\Big[\max_{p_X} \big|I(X;Y) - \mathbb{E}_S[I(X;Y)]\big| \ge \epsilon K\Big] \le \frac{\alpha}{K\epsilon^2} \tag{18}$$

To see this it suffices to note that for two positive functions $f$ and $g$ we have $|\max f - \max g| \le \max |f - g|$.
But unfortunately it is not clear how to extend our proofs to obtain (18). However as announced in the
introduction we can deduce (18) from our theorems, by using the union bound, as long as the maximum
is carried out over a finite set (sufficiently small with respect to $K$) of distributions.

We wish to argue here that Theorem 2 suggests a method for proving the concentration of the bit
error rate (BER) for uncoded communication

$$\frac{1}{2}\Big(1 - \frac{1}{K}\sum_{k=1}^{K} x_k^0 \hat{x}_k\Big) \tag{19}$$

where the MAP bit estimate for uncoded communication is defined through the marginal of (3), namely
$\hat{x}_k = \mathrm{argmax}_{x_k = \pm 1}\, p(x_k \mid y, s)$. We remark that

$$\hat{x}_k = \mathrm{sign}\langle x_k \rangle$$

where we find it convenient to adopt the statistical mechanics notation $\langle - \rangle$ for the average with respect
to the posterior measure (3). For example the average

$$\langle x_k \rangle = \sum_{x} x_k \, p(x \mid y, s)$$

(a soft bit estimate or "magnetization") can be obtained from the free energy by adding first an infinitesimal
perturbation ("small external magnetic field") to the exponent in (3), namely $h \sum_{k=1}^{K} x_k^0 x_k$,
and then differentiating the perturbed free energy (see footnote 4),

$$\frac{1}{K}\sum_{k=1}^{K} x_k^0 \langle x_k \rangle = \frac{d}{dh} f(y, s)\Big|_{h=0}$$



However one really needs to relate $\mathrm{sign}\langle x_k \rangle$ to the derivative of the free energy and this does not appear
to be obvious. One way out is to introduce product measures of $n$ copies (also called "real replicas") of
the posterior measure

$$p(x^{(1)} \mid y, s)\, p(x^{(2)} \mid y, s) \cdots p(x^{(n)} \mid y, s)$$

and then relate

$$\sum_{k=1}^{K} \big(x_k^0 \langle x_k \rangle\big)^n = \sum_{k=1}^{K} \big\langle x_k^0 x_k^{(1)}\, x_k^0 x_k^{(2)} \cdots x_k^0 x_k^{(n)} \big\rangle_n$$

to a suitable derivative of the replicated free energy. Then from the set of all moments one can in
principle reconstruct $\mathrm{sign}\langle x_k \rangle$. Thus one could try to deduce the concentration of the BER from the one
for the free energy. However the completion of this program requires a uniform, with respect to the system
size, control of the derivative of the free energy precisely at $h = 0$, which at the moment is still lacking (see footnote 5).
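
The link between the soft bit estimates and the perturbed free energy can be verified on a small instance. The sketch below computes the posterior averages $\langle x_k \rangle$ by enumeration, forms the magnetization $\frac{1}{K}\sum_k x_k^0 \langle x_k \rangle$, and compares it with a central finite difference in $h$ of the free energy perturbed by $h \sum_k x_k^0 x_k$. Function names, system size and step size are illustrative assumptions; the constant coming from the uniform prior is dropped since it does not affect the derivative.

```python
import itertools
import numpy as np

def _configs_and_exponents(y, s, sigma):
    """All inputs x in {-1,+1}^K and the exponents -||y - s x / sqrt(N)||^2 / (2 sigma^2)."""
    N, K = s.shape
    xs = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))
    resid = y[None, :] - xs @ s.T / np.sqrt(N)
    return xs, -np.sum(resid**2, axis=1) / (2 * sigma**2)

def perturbed_free_energy(h, y, s, x0, sigma):
    """(1/K) ln of the partition function with h * sum_k x0_k x_k added to the exponent."""
    xs, expo = _configs_and_exponents(y, s, sigma)
    expo = expo + h * (xs @ x0)
    top = expo.max()
    return (top + np.log(np.sum(np.exp(expo - top)))) / s.shape[1]

def magnetization(y, s, x0, sigma):
    """Return (1/K) sum_k x0_k <x_k> and the soft bit estimates <x_k>."""
    xs, expo = _configs_and_exponents(y, s, sigma)
    post = np.exp(expo - expo.max())
    post /= post.sum()                       # posterior weights for the uniform prior
    soft = post @ xs
    return np.mean(x0 * soft), soft

rng = np.random.default_rng(0)
K, N, sigma = 6, 8, 0.8
s = rng.standard_normal((N, K))
x0 = rng.choice([-1.0, 1.0], size=K)
y = s @ x0 / np.sqrt(N) + sigma * rng.standard_normal(N)

m1, soft = magnetization(y, s, x0, sigma)
h = 1e-5
slope = (perturbed_free_energy(h, y, s, x0, sigma)
         - perturbed_free_energy(-h, y, s, x0, sigma)) / (2 * h)
ber = 0.5 * (1 - np.mean(x0 * np.sign(soft)))    # expression (19) for this realization
```

The finite difference `slope` should match `m1` closely, whereas `ber` depends on the signs of the soft estimates and is not obtained from this derivative, which is the difficulty discussed above.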



Footnote 4: We do not write explicitly the $h$ dependence in the perturbed free energy.

Footnote 5: However this can be done for Lebesgue almost every $h$.

2.2 Independence with respect to the distribution of the spreading sequence



The replica method leads to the same Tanaka formula for a general class of symmetric distributions
$p(s_{ik}) = p(-s_{ik})$. We are able to prove this: in particular binary and Gaussian spreading sequences lead
to the same capacity.

Theorem 4. Consider CDMA with binary inputs and assume A for the spreading sequence. Let $C_g$ be
the capacity for Gaussian spreading sequences (symmetric i.i.d. with unit variance). Then

$$\lim_{K \to +\infty} (C_K - C_g) = 0$$

This theorem turns out to be very useful in order to obtain the bound on capacity because it allows
us to make use of convenient integration by parts identities that have no clear counterpart in the non-Gaussian
case. The proof of the theorem is given in Section 5.

2.3 Existence of the limit $K \to +\infty$

The interpolation method can be used to show the existence of the limit $K \to +\infty$ for $C_K$.

Theorem 5. Consider CDMA with binary inputs and assume A for the spreading sequences with uniform
input distribution. Then

$$\lim_{K \to \infty} C_K \ \text{exists} \tag{20}$$

The proof of this theorem is given in Section 6 for Gaussian spreading sequences. The general case
then follows because of Theorem 4.

2.4 Tight upper bound on the capacity 

The main result of this paper is that Tanaka's formula (10) is an upper bound to the capacity for all
values of $\beta$.

Theorem 6. Consider CDMA with binary inputs and assume A for the spreading sequence. We have

$$\lim_{K \to \infty} C_K \le \min_{m \in [0,1]} c_{RS}(m) \tag{21}$$

where $c_{RS}(m)$ is given by (10).

If we combine this result with an inequality in Montanari and Tse [10], and exchange as they do
the limits of $K \to +\infty$ and sparse $\to$ dense, one can deduce that the equality holds for some regime
of noise smaller than a critical value. This value corresponds to the threshold for belief propagation
decoding. Note that this equality is valid even if $\beta$ is such that there is a phase transition (the fixed
point equation (12) has many solutions), whereas in [10] the equality holds for values of $\beta$ for which the
phase transition does not occur.

Since the proof is rather complicated we find it useful to give the main ideas in an informal way. The
integral term in (10) suggests that we can replace the original system with a simpler system where the
user bits are sent through $K$ independent Gaussian channels given by

$$\tilde{y}_k = x_k + \frac{1}{\sqrt{\lambda}} w_k \tag{22}$$

where $w_k \sim \mathcal{N}(0, 1)$ and $\lambda$ is an effective SNR. Of course this argument is a bit naive because this
effective system does not account for the extra terms in (10), but it has the merit of identifying the
correct interpolation.

We introduce an interpolating parameter $t \in [0, 1]$ such that the independent Gaussian channels
correspond to $t = 0$ and the original CDMA system corresponds to $t = 1$ (see Figure 2.4). It is convenient
to denote the SNR of the original Gaussian channel as $B$ (that is $B = \sigma^{-2}$). Then (11) becomes

$$\lambda = \frac{B}{1 + \beta B(1 - m)}$$

We introduce two interpolating SNR functions $\lambda(t)$ and $B(t)$ such that

$$\lambda(0) = \lambda, \quad B(0) = 0 \quad \text{and} \quad \lambda(1) = 0, \quad B(1) = B \tag{23}$$

and

$$\lambda(t) + \frac{B(t)}{1 + \beta B(t)(1 - m)} = \frac{B}{1 + \beta B(1 - m)} \tag{24}$$

The meaning of (24) is the following. In the interpolating $t$-system the effective SNR seen by each user
has an effective $t$-CDMA part and an independent channel part $\lambda(t)$ chosen such that the total SNR
is fixed to the effective SNR of the CDMA system. There is a whole class of interpolating functions
satisfying the above conditions but it turns out that we do not need to specify them more precisely except
for the fact that $B(t)$ is increasing, $\lambda(t)$ is decreasing and both have continuous first derivatives. Subsequent
calculations are independent of the particular choices of functions.
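
As an illustration of conditions (23) and (24), the sketch below uses one admissible choice, the linear path $B(t) = tB$, and obtains $\lambda(t)$ from (24). This particular path and the function name are assumptions made for the example; the argument in the text does not depend on them.

```python
import numpy as np

def interpolating_snrs(t, B, beta, m):
    """Return (lambda(t), B(t)) for the illustrative path B(t) = t * B, with lambda(t) from (24)."""
    B_t = t * B
    total = B / (1 + beta * B * (1 - m))               # effective SNR of the CDMA system
    lam_t = total - B_t / (1 + beta * B_t * (1 - m))   # remaining independent-channel part
    return lam_t, B_t

B, beta, m = 2.0, 1.5, 0.3
ts = np.linspace(0.0, 1.0, 101)
lams, Bs = interpolating_snrs(ts, B, beta, m)
```

Along this path $B(t)$ increases from $0$ to $B$ while $\lambda(t)$ decreases from $\lambda$ to $0$, as required.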

The parameter $m$ is to be considered as fixed to any arbitrary value in $[0, 1]$. All the subsequent
calculations are independent of its value, which is to be optimized to tighten the final bound.

We now have two sets of channel outputs $y$ (from the CDMA with noise variance $B(t)^{-1}$) and $\tilde{y}$ (from
the independent channels with noise variance $\lambda(t)^{-1}$) and the interpolating communication system has
a posterior distribution

$$p_t(x \mid y, \tilde{y}, s) = \frac{2^{-K}}{Z_t(y, \tilde{y}, s)} \exp\Big(-\frac{B(t)}{2}\big\|y - N^{-1/2} s x\big\|^2 - \frac{\lambda(t)}{2}\|\tilde{y} - x\|^2\Big) \tag{25}$$

Note that here we take without loss of generality $p_X(x) = 2^{-K}$. By analyzing the mutual information
$\mathbb{E}_S[I_t(X; Y, \tilde{Y})]$ of the interpolating system we can relate $\mathbb{E}_S[I(X;Y)]$ (the $t = 1$ value) to the easily
computed entropy $\mathbb{E}_S[I_0(X; \tilde{Y})]$ of the independent channel limit. The average over $(Y, \tilde{Y})$ is now
performed with respect to

$$p_t(y, \tilde{y} \mid s) = \frac{1}{2^K} \sum_{x^0} \frac{B(t)^{N/2} \lambda(t)^{K/2}}{(2\pi)^{(N+K)/2}} \exp\Big(-\frac{B(t)}{2}\big\|y - N^{-1/2} s x^0\big\|^2 - \frac{\lambda(t)}{2}\|\tilde{y} - x^0\|^2\Big) \tag{26}$$

These equations completely define the interpolating communication system.

In order to carry out this program successfully it turns out that we need a concentration result on the
empirical average of the "magnetization",

$$m_1 = \frac{1}{K} \sum_{k=1}^{K} x_k^0 x_k$$

which, as explained in Section 2.1, is closely related to the BER. Informally speaking we need to prove
that the fluctuations $\mathbb{E}\langle |m_1 - \mathbb{E}\langle m_1 \rangle| \rangle$ are small. This involves the control of two types of fluctuations,
$\mathbb{E}\langle |m_1 - \langle m_1 \rangle| \rangle$ and $\mathbb{E}|\langle m_1 \rangle - \mathbb{E}\langle m_1 \rangle|$ (by the triangle inequality). In some spin glass problems both
types of fluctuations need not be small at the same time. Indeed it is a quite general fact that the first
one is small for thermodynamic (or convexity) reasons while the smallness of the second is not assured
if replica symmetry breaking occurs (see [9]). Here we use a crucial ingredient that is specific to the
communication setup, namely the channel symmetry, which induces a gauge symmetry and prevents
replica symmetry breaking. This, it turns out, allows us to prove that both fluctuations are small. The control of
these fluctuations is the object of Theorem 7 in Section 3.5. There are technical complications that we
have to deal with because such control of fluctuations is only possible away from phase transitions. For
this reason we have to add small appropriate perturbations to the measure (25) and give almost sure
statements with respect to the strength of the perturbation. By being sufficiently careful with the order
of limits the extra perturbation terms can be removed at the end of the calculations.



3 Proof of bound on capacity: Theorem 6

3.1 Preliminaries



The interpolating communication system defined by the measure (25) allows us to compare the original
CDMA system with the independent channel system. The distribution of $y, \tilde{y}$ is given by (26). This
distribution consists of a summation of $2^K$ terms, each corresponding to a different possible input sequence.
Each of these terms contributes equally to the capacity (free energy). The reader can explicitly check
this by making the change of variables $x_k \to x_k^0 x_k$ and $s_{ik} \to s_{ik} x_k^0$, $w_k \to w_k x_k^0$, $h_k \to h_k x_k^0$ which
leave all standard Gaussians invariant. Hence we can assume that a particular input sequence, say $x^0$, is
transmitted. The distribution of the received vectors with this assumption is

$$p_t(y, \tilde{y} \mid s) = \frac{B(t)^{N/2} \lambda(t)^{K/2}}{(2\pi)^{(N+K)/2}} \exp\Big(-\frac{B(t)}{2}\big\|y - N^{-1/2} s x^0\big\|^2 - \frac{\lambda(t)}{2}\|\tilde{y} - x^0\|^2\Big) \tag{27}$$

For technical reasons that will become clear only in the next section we consider a slightly more general
interpolation system where the perturbation term

$$h_u(x) = \sqrt{u} \sum_{k=1}^{K} h_k x_k + u \sum_{k=1}^{K} x_k^0 x_k - \sqrt{u} \sum_{k=1}^{K} |h_k| \tag{28}$$

is added in the exponent of the measure (25). Here the $h_k$ are i.i.d. with $h_k \sim \mathcal{N}(0, 1)$. For the moment $u > 0$
is arbitrary but in the sequel we will take $u \to 0$. This time it is convenient to perform a new change of
variables $y = B(t)^{-1/2} n + N^{-1/2} s x^0$ and $\tilde{y} = \lambda(t)^{-1/2} w + x^0$, where $n_i, w_k \sim \mathcal{N}(0, 1)$, and we set $\langle - \rangle_{t,u}$
for the average corresponding to the posterior measure

$$p_{t,u}(x \mid n, w, h, s) = \frac{1}{Z_{t,u}} \exp\Big(-\frac{1}{2}\big\|n + N^{-1/2} B(t)^{1/2} s (x^0 - x)\big\|^2 - \frac{1}{2}\big\|w + \lambda(t)^{1/2}(x^0 - x)\big\|^2 + h_u(x)\Big) \tag{29}$$

with the obvious normalization factor $Z_{t,u}$. We define a free energy

$$f_{t,u}(n, w, h, s) = \frac{1}{K} \ln Z_{t,u} \tag{30}$$

For $t = 1$ we recover the original free energy,

$$\mathbb{E}[f(y, s)] = \frac{1}{2} + \lim_{u \to 0} \mathbb{E}[f_{1,u}(n, w, h, s)]$$

while for $t = 0$ the statistical sums decouple and we have the explicit result

$$\frac{1}{2} + \lim_{u \to 0} \mathbb{E}[f_{0,u}(n, w, h, s)] = -\frac{1}{2\beta} - \lambda + \int Dz \, \ln\big(2\cosh(\sqrt{\lambda} z + \lambda)\big) \tag{31}$$

where $\mathbb{E}$ denotes the appropriate collective expectation over random objects. In view of formula (7), in
order to obtain the average capacity it is sufficient to compute

$$\lim_{K \to +\infty} \lim_{u \to 0} \mathbb{E}[f_{1,u}(n, w, h, s)] + \frac{1}{2} \tag{32}$$
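
The decoupling at $t = 0$ behind (31) can be checked realization by realization. With $B(0) = 0$, $u = 0$ and the transmitted sequence taken to be all ones, the sum over $x$ in $Z_{0,0}$ factorizes over users, and each factor equals $e^{-w_k^2/2} e^{-a_k}\, 2\cosh(a_k)$ with $a_k = \sqrt{\lambda} w_k + \lambda$. The sketch below compares the enumerated and the factorized expressions; the function names and sizes are illustrative assumptions. Averaging the factorized form over $n$ and $w$ gives the right-hand side of (31) minus $\frac{1}{2}$.

```python
import itertools
import numpy as np

def f0_bruteforce(n, w, lam):
    """(1/K) ln Z_{0,0}: enumerate x in {-1,+1}^K with x0 = (1,...,1) and B(0) = 0."""
    K = w.size
    xs = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))
    expo = (-0.5 * np.sum(n**2)
            - 0.5 * np.sum((w[None, :] + np.sqrt(lam) * (1 - xs))**2, axis=1))
    top = expo.max()                                   # log-sum-exp trick
    return (top + np.log(np.sum(np.exp(expo - top)))) / K

def f0_decoupled(n, w, lam):
    """Same quantity after the sum over x factorizes over the K users."""
    K = w.size
    a = np.sqrt(lam) * w + lam
    per_user = -0.5 * w**2 - a + np.logaddexp(a, -a)   # logaddexp(a, -a) = ln(2 cosh a)
    return (-0.5 * np.sum(n**2) + np.sum(per_user)) / K

rng = np.random.default_rng(0)
n = rng.standard_normal(5)     # N = 5 chip-noise variables
w = rng.standard_normal(6)     # K = 6 independent-channel noise variables
diff = f0_bruteforce(n, w, 0.7) - f0_decoupled(n, w, 0.7)
```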

There is no loss in generality in setting

$$x_k^0 = 1 \tag{33}$$

for the input symbols. From now on, in sections 3, 4 and 6, we stick to (33). We also use the shorthand
notations

$$z_k = x_k^0 - x_k = 1 - x_k, \qquad f_{t,u}(n, w, h, s) = f_{t,u}$$

Using $|h_u(x)| \le 2\sqrt{u} \sum_k |h_k| + K u$ it easily follows that ($u$ small)

$$|\mathbb{E}[f_{t,u}] - \mathbb{E}[f_{t,0}]| \le 2\sqrt{u}\, \mathbb{E}[|h_k|] + u \tag{34}$$

therefore we can permute the two limits in (32) and compute

$$\lim_{u \to 0} \lim_{K \to +\infty} \mathbb{E}[f_{1,u}] + \frac{1}{2}$$
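The elementary bound $|h_u(x)| \le 2\sqrt{u}\sum_k |h_k| + Ku$ behind (34) can be verified by exhaustive enumeration on a small system. This is a minimal sketch assuming the convention $x_k^0 = 1$ of (33); the values of $K$, $u$ and the random seed are arbitrary choices made for illustration.

```python
import itertools
import math
import random

def h_u(x, h, u):
    """Perturbation term (28) with x^0_k = 1."""
    return (math.sqrt(u) * sum(hk * xk for hk, xk in zip(h, x))
            + u * sum(x)
            - math.sqrt(u) * sum(abs(hk) for hk in h))

def worst_ratio(K=8, u=0.05, seed=0):
    """Largest value of |h_u(x)| over all x in {-1,+1}^K, divided by the bound."""
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) for _ in range(K)]
    bound = 2.0 * math.sqrt(u) * sum(abs(hk) for hk in h) + K * u
    worst = max(abs(h_u(x, h, u)) for x in itertools.product((-1, 1), repeat=K))
    return worst / bound
```

A ratio of at most 1 for every seed is what the bound predicts. Dividing the bound by $K$ and taking expectations gives the right-hand side of (34).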

From now on we keep the limits in that order. By the fundamental theorem of calculus,

$$\mathbb{E}[f_{1,u}] = \mathbb{E}[f_{0,u}] + \int_0^1 dt\, \frac{d}{dt} \mathbb{E}[f_{t,u}] \tag{35}$$

Our task is now reduced to estimating

$$\lim_{u \to 0} \lim_{K \to +\infty} \int_0^1 dt\, \frac{d}{dt} \mathbb{E}[f_{t,u}]$$

This is done in sections 3.4 and 3.5. This requires a few preliminary results that are the object of sections
3.2 and 3.3.



3.2 Nishimori identities 

As already alluded to in the introduction, the "magnetization" plays an important role,

$$m_1 = \frac{1}{K} \sum_{k=1}^K x_k \tag{36}$$

A closely related quantity is the "overlap parameter"

$$q_{12} = \frac{1}{K} \sum_{k=1}^K x_k^{(1)} x_k^{(2)} \tag{37}$$

where $x_k^{(1)}$ and $x_k^{(2)}$ are independent copies ("replicas") of the $x_k$. This means that the joint distribution
of $(x^{(1)}, x^{(2)})$ is the product measure

$$p_{t,u}(x^{(1)} \mid n, w, h, s)\, p_{t,u}(x^{(2)} \mid n, w, h, s)$$

The average with respect to this joint distribution is denoted (by a slight abuse of notation) with the
same bracket $\langle - \rangle_{t,u}$. The important thing to notice is that the replicas are "coupled" through the
common randomness $(n, w, h, s)$.



3 It is also straightforward to compute the full $u$ dependence and see that it is $O(\sqrt{u})$, uniformly in $K$.

Lemma 1. The distributions of $m_1$ and $q_{12}$ defined as

$$P_{m_1}(x) = \mathbb{E}\langle \delta(x - m_1) \rangle_{t,u}, \qquad P_{q_{12}}(x) = \mathbb{E}\langle \delta(x - q_{12}) \rangle_{t,u}$$

are equal, namely

$$P_{m_1}(x) = P_{q_{12}}(x)$$

In particular the following identity holds

$$\mathbb{E}[\langle m_1 \rangle_{t,u}] = \mathbb{E}[\langle q_{12} \rangle_{t,u}] \tag{38}$$

Such identities are known as Nishimori identities in the statistical physics literature and are a consequence
of a gauge symmetry satisfied by the measure $\mathbb{E}\langle - \rangle_{t,u}$. They have also been used in the context of
communications (see [11], [16]). For completeness a sketch of the proof is given in Appendix F.
The next two identities also follow from similar considerations.
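The identity $\mathbb{E}\langle m_1 \rangle = \mathbb{E}\langle q_{12} \rangle$ can be illustrated on the simplest instance. The sketch below assumes the decoupled system ($t = 0$, $u = 0$) with a single user and $x^0 = 1$, where the posterior weight is proportional to $\exp((\sqrt{\lambda}\, w + \lambda) x)$, so that $\langle x \rangle = \tanh(\sqrt{\lambda}\, w + \lambda)$ and $\langle x^{(1)} x^{(2)} \rangle = \langle x \rangle^2$. The Gaussian expectation over $w$ is approximated by a trapezoid rule; the function names are illustrative only.

```python
import math

def gauss_expect(g, lo=-12.0, hi=12.0, n=4801):
    """Approximate E[g(z)] for z ~ N(0,1) with the trapezoid rule."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        z = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * g(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def nishimori_gap(lam):
    """E<x> - E<x>^2 for one user with x^0 = 1 and decoupled channel parameter lam."""
    def posterior_mean(w):
        return math.tanh(math.sqrt(lam) * w + lam)
    magnetization = gauss_expect(posterior_mean)                 # plays the role of E<m_1>
    overlap = gauss_expect(lambda w: posterior_mean(w) ** 2)     # plays the role of E<q_12>
    return magnetization - overlap
```

For any $\lambda > 0$ the returned gap should vanish up to quadrature error, which is the single-user version of (38).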

Lemma 2. Let

$$Z = n + \sqrt{\frac{B(t)}{N}}\, s z$$

Consider two replicas $Z^{(a)}$, $a = 1, 2$, corresponding to $z_k^{(a)} = 1 - x_k^{(a)}$. We have then

$$\frac{1}{N} \mathbb{E}[\langle \|Z\|^2 \rangle_{t,u}] = 1 \tag{39}$$

$$\mathbb{E}[\langle (n \cdot Z^{(2)}) (z^{(1)} \cdot z^{(2)}) \rangle_{t,u}] = \sum_k \mathbb{E}[\langle (n \cdot Z)\, z_k \rangle_{t,u}] \tag{40}$$

3.3 Concentration of Magnetization 

A crucial feature of the calculation in the next paragraph is that $m_1$ (and $q_{12}$) concentrate, namely

Theorem 7. Fix any $\epsilon > 0$. For Lebesgue almost every $u > \epsilon$,

$$\lim_{N \to \infty} \int_0^1 dt\, \mathbb{E}\langle |m_1 - \mathbb{E}\langle m_1 \rangle_{t,u}| \rangle_{t,u} = 0$$

The proof of this theorem, which is the point where the careful tuning of the perturbation is needed,
has an interest of its own and is presented in section 4. Similar statements in the spin glass literature
have been obtained by Talagrand [8]. The usual signature of replica symmetry breaking is the absence
of concentration for the overlap parameter $q_{12}$. This theorem combined with the Nishimori identity
"explains" why the replica symmetry is not broken.

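We will also need the following corollary.

Corollary 1. The following holds

$$\frac{1}{N^{3/2}} \mathbb{E}\langle (n \cdot sz)(1 - m_1) \rangle_{t,u} = \frac{1}{N^{3/2}} \mathbb{E}\langle n \cdot sz \rangle_{t,u}\, (1 - \mathbb{E}\langle m_1 \rangle_{t,u}) + o_N(1)$$

with $\lim_{N \to +\infty} o_N(1) = 0$ for almost every $u > 0$.

Proof. By the Cauchy-Schwarz inequality

$$\frac{1}{N^{3/2}} \mathbb{E}\langle (n \cdot sz)(\mathbb{E}\langle m_1 \rangle_{t,u} - m_1) \rangle_{t,u} \le \frac{1}{N^{3/2}} \big( \mathbb{E}\langle (n \cdot sz)^2 \rangle_{t,u} \big)^{1/2} \big( \mathbb{E}\langle (\mathbb{E}\langle m_1 \rangle_{t,u} - m_1)^2 \rangle_{t,u} \big)^{1/2}$$

Because of the concentration of the magnetization $m_1$ (Theorem 7) it suffices to prove that

$$\mathbb{E}\langle (N^{-3/2}\, n \cdot sz)^2 \rangle_{t,u} \le D \tag{41}$$

for some constant $D$ independent of $N$. The proof follows from the central limit theorem and is given
in Appendix G. □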



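3.4 Computation of $\frac{d}{dt}\mathbb{E}[f_{t,u}]$

We have

$$\frac{d}{dt}\mathbb{E}[f_{t,u}] = T_1 + T_2 \tag{42}$$

where

$$T_1 = -\frac{\lambda'(t)}{2\sqrt{\lambda(t)}\, K} \mathbb{E}\langle w \cdot z \rangle_{t,u} - \frac{\lambda'(t)}{2K} \mathbb{E}\langle z \cdot z \rangle_{t,u} \tag{43}$$

and

$$T_2 = -\frac{B'(t)}{2\sqrt{B(t)}\, K \sqrt{N}} \mathbb{E}\langle Z \cdot sz \rangle_{t,u} \tag{44}$$

3.4.1 Transforming $T_1$

Integration by parts with respect to $w_k$ leads to

$$T_1 = \frac{\lambda'(t)}{2\sqrt{\lambda(t)}\, K} \mathbb{E}\langle (w + \sqrt{\lambda(t)}\, z) \cdot z \rangle_{t,u} - \frac{\lambda'(t)}{2\sqrt{\lambda(t)}\, K} \mathbb{E}\langle z^{(1)} \cdot (w + \sqrt{\lambda(t)}\, z^{(2)}) \rangle_{t,u} - \frac{\lambda'(t)}{2K} \mathbb{E}\langle z \cdot z \rangle_{t,u}$$

$$= -\frac{\lambda'(t)}{2} \mathbb{E}\langle 1 - 2 m_1 + q_{12} \rangle_{t,u} = -\frac{\lambda'(t)}{2} \mathbb{E}\langle 1 - m_1 \rangle_{t,u}$$

To obtain the second equality we remark that the $w$ terms cancel, and the third one follows from
(38). From the relation between $\lambda(t)$ and $B(t)$ given in equation (24), $T_1$ can be rewritten in the form

$$T_1 = \frac{B'(t)}{2(1 + \beta B(t)(1 - m))^2}\, \mathbb{E}\langle 1 - m_1 \rangle_{t,u} \tag{45}$$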

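3.4.2 Transforming $T_2$

The term $T_2$ can be rewritten as

$$T_2 = -\frac{B'(t)}{2 B(t) K} \mathbb{E}\langle \|Z\|^2 \rangle_{t,u} + \frac{B'(t)}{2 B(t) K} \mathbb{E}[\|n\|^2] + \frac{B'(t)}{2\sqrt{B(t)}\, K \sqrt{N}} \mathbb{E}\langle n \cdot sz \rangle_{t,u}$$

Because of (39) the first two terms cancel,

$$T_2 = \frac{B'(t)}{2\sqrt{B(t)}\, K \sqrt{N}} \mathbb{E}\langle n \cdot sz \rangle_{t,u} \tag{46}$$

Now we use integration by parts with respect to $s_{ik}$,

$$T_2 = -\frac{B'(t)}{2KN} \mathbb{E}\langle (n \cdot Z)(z \cdot z) \rangle_{t,u} + \frac{B'(t)}{2KN} \mathbb{E}\langle (n \cdot Z^{(2)})(z^{(1)} \cdot z^{(2)}) \rangle_{t,u}$$

and the Nishimori identity (40), together with $z \cdot z = 2\sum_k z_k$,

$$T_2 = -\frac{B'(t)}{2NK} \sum_k \mathbb{E}[\|n\|^2 \langle z_k \rangle_{t,u}] - \frac{B'(t)\sqrt{B(t)}}{2 K N^{3/2}} \sum_k \mathbb{E}\langle (n \cdot sz)\, z_k \rangle_{t,u}$$

Since $\frac{1}{N}\|n\|^2 = \frac{1}{N}\sum_i n_i^2$ concentrates on 1, we get

$$T_2 = -\frac{B'(t)}{2} \mathbb{E}\langle 1 - m_1 \rangle_{t,u} - \frac{\beta B'(t)\sqrt{B(t)}}{2 K N^{1/2}} \mathbb{E}\langle (n \cdot sz)(1 - m_1) \rangle_{t,u} + o_N(1)$$

Applying Corollary 1 to the last expression for $T_2$ together with (46) we obtain a closed affine equation
for the latter, whose solution is

$$T_2 = -\frac{B'(t)\, \mathbb{E}\langle 1 - m_1 \rangle_{t,u}}{2(1 + \beta B(t)\, \mathbb{E}\langle 1 - m_1 \rangle_{t,u})} + o_N(1) \tag{47}$$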

3.5 End of proof 

We add and subtract the term $\frac{1}{2\beta}\ln(1 + \beta B(1 - m))$ from (35) and use the integral representation

$$\frac{1}{2\beta}\ln(1 + \beta B(1 - m)) = \frac{1}{2}\int_0^1 dt\, \frac{B'(t)(1 - m)}{1 + \beta B(t)(1 - m)}$$

to obtain

$$\mathbb{E}[f_{1,u}] = \mathbb{E}[f_{0,u}] - \frac{1}{2\beta}\ln(1 + \beta B(1 - m)) + \int_0^1 dt \Big\{ \frac{d}{dt}\mathbb{E}[f_{t,u}] + \frac{B'(t)(1 - m)}{2(1 + \beta B(t)(1 - m))} \Big\}$$

If one uses (42) and expressions (45), (47), some remarkable algebra occurs in the last integral. The
integrand becomes

$$R(t) + \frac{B'(t)(1 - m)}{2(1 + \beta B(t)(1 - m))^2}$$

with

$$R(t) = \frac{\beta B'(t) B(t)\, (\mathbb{E}\langle m_1 \rangle_{t,u} - m)^2}{2(1 + \beta B(t)(1 - m))^2\, (1 + \beta B(t)\, \mathbb{E}\langle 1 - m_1 \rangle_{t,u})}$$

So the integral has a positive contribution $\int_0^1 dt\, R(t) \ge 0$ plus a computable contribution equal to

$$\frac{B(1 - m)}{2(1 + \beta B(1 - m))} = \frac{\lambda}{2}(1 - m)$$

Finally, thanks to (31), we find

$$\frac{1}{2} + \mathbb{E}[f_{1,u}] = \int Dz \ln(2\cosh(\sqrt{\lambda}\, z + \lambda)) - \frac{1}{2\beta} - \frac{1}{2\beta}\ln(1 + \beta B(1 - m)) - \frac{\lambda}{2}(1 + m) + \int_0^1 R(t)\, dt + o_N(1) + O(\sqrt{u}) \tag{48}$$

where for a.e. $u > \epsilon$, $\lim_{N \to \infty} o_N(1) = 0$. We take first the limit $N \to \infty$, then $u \to \epsilon$ (along some
appropriate sequence) and then $\epsilon \to 0$ to obtain a formula for the free energy where the only non-explicit
contribution is $\int_0^1 dt\, R(t)$. Since this is positive for all $m$, we obtain a lower bound on the free
energy which is equivalent to the announced upper bound on the capacity.
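The algebra of this section can be checked mechanically. Writing $D = 1 + \beta B(t)(1 - m)$ and $D_e = 1 + \beta B(t)\,\mathbb{E}\langle 1 - m_1 \rangle_{t,u}$, the claim is that $T_1 + T_2 + \frac{B'(t)(1-m)}{2D}$ equals $R(t) + \frac{B'(t)(1-m)}{2D^2}$ with $R(t) = \frac{\beta B'(t) B(t) (\mathbb{E}\langle m_1 \rangle_{t,u} - m)^2}{2 D^2 D_e}$. The sketch below tests this identity on random inputs, treating $B(t)$, $B'(t)$, $m$ and $\mathbb{E}\langle m_1 \rangle_{t,u}$ as free numbers and dropping the $o_N(1)$ term; the sampling ranges are arbitrary.

```python
import random

def integrand_pieces(beta, B, dB, m, Em1):
    """Return (T1 + T2 + added term, R + explicit term, R) for given numbers."""
    D = 1.0 + beta * B * (1.0 - m)
    De = 1.0 + beta * B * (1.0 - Em1)
    T1 = dB * (1.0 - Em1) / (2.0 * D ** 2)       # T_1 written in terms of B'(t)
    T2 = -dB * (1.0 - Em1) / (2.0 * De)          # closed-form T_2 without o_N(1)
    added = dB * (1.0 - m) / (2.0 * D)           # term added under the integral
    R = beta * dB * B * (Em1 - m) ** 2 / (2.0 * D ** 2 * De)
    explicit = dB * (1.0 - m) / (2.0 * D ** 2)
    return T1 + T2 + added, R + explicit, R

def max_mismatch(trials=1000, seed=1):
    """Largest |lhs - rhs| over random parameter draws; also checks R >= 0."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        beta = rng.uniform(0.1, 3.0)
        B = rng.uniform(0.0, 5.0)
        dB = rng.uniform(0.0, 5.0)     # B'(t) >= 0 is assumed here
        m = rng.uniform(-1.0, 1.0)
        Em1 = rng.uniform(-1.0, 1.0)
        lhs, rhs, R = integrand_pieces(beta, B, dB, m, Em1)
        assert R >= 0.0
        worst = max(worst, abs(lhs - rhs))
    return worst
```

The mismatch should be at the level of floating-point rounding, and $R$ is non-negative whenever $B(t)$ and $B'(t)$ are, which is the property used to turn the sum rule into a bound.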

4 Concentration of Magnetization 

The goal of this section is to prove Theorem 7. The proof is organized in a succession of lemmas. By
the same methods used for Theorem 2 we can prove

Lemma 3. There exists a strictly positive constant $\alpha$ (which remains positive for all $t$ and $u$) such that

$$\mathbb{P}[|f_{t,u} - \mathbb{E}[f_{t,u}]| \ge \epsilon] = O(e^{-\alpha \epsilon^2 \sqrt{K}})$$

The perturbation term (28) has been chosen carefully so that the following holds.

Lemma 4. When considered as a function of $u$, $f_{t,u}$ is convex in $u$.

Proof. We simply evaluate the second derivative and show it is positive. The first derivative is

$$\frac{d f_{t,u}}{du} = \langle L(x) \rangle_{t,u} - \frac{1}{2\sqrt{u}\, K} \sum_k |h_k|$$

where we have defined

$$L(x) = \frac{1}{K} \sum_k \Big( \frac{1}{2\sqrt{u}}\, h_k x_k + x_k \Big)$$

Differentiating again,

$$\frac{d^2 f_{t,u}}{du^2} = \frac{1}{4 u^{3/2} K} \sum_k \big( |h_k| - h_k \langle x_k \rangle_{t,u} \big) + K \big( \langle L(x)^2 \rangle_{t,u} - \langle L(x) \rangle_{t,u}^2 \big) \ge 0 \tag{49}$$

□
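Convexity in $u$ can be observed by brute force on a small system. The sketch below is an illustration under simplifying assumptions: the two quadratic terms of the posterior exponent do not depend on $u$, so they are replaced by arbitrary random weights `base[x]`, and $x_k^0 = 1$. The free energy is then $\frac{1}{K}\ln\sum_x \exp(\text{base}[x] + h_u(x))$ and its discrete second differences on a grid of $u$ values should be non-negative.

```python
import itertools
import math
import random

def free_energy(u, h, base):
    """(1/K) ln sum_x exp(base[x] + h_u(x)) with x^0_k = 1, by enumeration."""
    K = len(h)
    abs_sum = sum(abs(hk) for hk in h)
    exponents = []
    for x in itertools.product((-1, 1), repeat=K):
        perturbation = (math.sqrt(u) * sum(hk * xk for hk, xk in zip(h, x))
                        + u * sum(x) - math.sqrt(u) * abs_sum)
        exponents.append(base[x] + perturbation)
    top = max(exponents)  # log-sum-exp for numerical stability
    return (top + math.log(sum(math.exp(e - top) for e in exponents))) / K

def min_second_difference(K=6, seed=3, step=0.01):
    """Smallest discrete second difference of u -> f(u) on a grid in [0.05, 1.04]."""
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) for _ in range(K)]
    # Hypothetical u-independent weights standing in for the quadratic terms.
    base = {x: rng.gauss(0.0, 2.0) for x in itertools.product((-1, 1), repeat=K)}
    grid = [0.05 + step * i for i in range(100)]
    f = [free_energy(u, h, base) for u in grid]
    return min(f[i - 1] - 2.0 * f[i] + f[i + 1] for i in range(1, len(f) - 1))
```

The reason this works for any weights is that each exponent is a convex function of $u$ (the coefficient of $\sqrt{u}$ is $\sum_k (h_k x_k - |h_k|) \le 0$), and a log-sum-exp of convex functions is convex.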



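The quantity $L(x)$ turns out to be very useful and satisfies two concentration properties.

Lemma 5. For any $a > \epsilon > 0$ fixed,

$$\int_\epsilon^a du\, \mathbb{E}\langle |L(x) - \langle L(x) \rangle_{t,u}| \rangle_{t,u} = O\Big(\frac{1}{\sqrt{K}}\Big)$$

Proof. From equation (49), we have

$$\int_\epsilon^a du\, \mathbb{E}\langle (L(x) - \langle L(x) \rangle_{t,u})^2 \rangle_{t,u} \le \frac{1}{K}\int_\epsilon^a du\, \frac{d^2}{du^2}\mathbb{E}[f_{t,u}] = \frac{1}{K}\Big( \frac{d}{du}\mathbb{E}[f_{t,a}] - \frac{d}{du}\mathbb{E}[f_{t,\epsilon}] \Big) = O\Big(\frac{1}{K}\Big)$$

In the very last equality we use that the first derivative of $\mathbb{E}[f_{t,u}]$ is bounded for $u \ge \epsilon$. Using the
Cauchy-Schwarz inequality for $\int du\, \mathbb{E}\langle - \rangle_{t,u}$ we obtain the lemma. □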



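Lemma 6. For any $a > \epsilon > 0$ fixed,

$$\int_\epsilon^a du\, \mathbb{E}\big| \langle L(x) \rangle_{t,u} - \mathbb{E}\langle L(x) \rangle_{t,u} \big| = O\Big(\frac{1}{K^{1/16}}\Big)$$

Proof. From convexity of $f_{t,u}$ with respect to $u$ (Lemma 4) we have for any $\delta > 0$,

$$\frac{d}{du} f_{t,u} - \frac{d}{du}\mathbb{E}[f_{t,u}] \le \frac{f_{t,u+\delta} - \mathbb{E}[f_{t,u+\delta}]}{\delta} - \frac{f_{t,u} - \mathbb{E}[f_{t,u}]}{\delta} + \frac{d}{du}\mathbb{E}[f_{t,u+\delta}] - \frac{d}{du}\mathbb{E}[f_{t,u}]$$

A similar lower bound holds with $\delta$ replaced by $-\delta$. Now from Lemma 3 we know that the fluctuations in the first two
terms are $O(K^{-1/4})$. Thus from the formula for the first derivative in the proof of Lemma 4 and the fact
that the fluctuations of $\frac{1}{K}\sum_{k=1}^K |h_k|$ are $O(1/\sqrt{K})$ we get

$$\mathbb{E}\big| \langle L(x) \rangle_{t,u} - \mathbb{E}\langle L(x) \rangle_{t,u} \big| \le \frac{1}{\delta}\, O\Big(\frac{1}{K^{1/4}}\Big) + O\Big(\frac{1}{\sqrt{K}}\Big) + \frac{d}{du}\mathbb{E}[f_{t,u+\delta}] - \frac{d}{du}\mathbb{E}[f_{t,u}]$$

We will choose $\delta = K^{-1/8}$. Note that we cannot assume that the difference of the two derivatives is small
because the first derivative of the free energy is not uniformly continuous in $K$ (as $K \to \infty$ it may
develop jumps at the phase transition points). The free energy itself is uniformly continuous. For this
reason, if we integrate with respect to $u$, using (34) we get

$$\int_\epsilon^a du\, \mathbb{E}\big| \langle L(x) \rangle_{t,u} - \mathbb{E}\langle L(x) \rangle_{t,u} \big| \le O\Big(\frac{1}{K^{1/16}}\Big)$$

□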

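Using the last two lemmas we can prove Theorem 7.

Proof of Theorem 7. Combining the concentration lemmas we get

$$\int_\epsilon^a du\, \mathbb{E}\langle |L(x) - \mathbb{E}\langle L(x) \rangle_{t,u}| \rangle_{t,u} \le O\Big(\frac{1}{K^{1/16}}\Big)$$

For any function $g(x)$ such that $|g(x)| \le 1$, we have

$$\int_\epsilon^a du\, \big| \mathbb{E}\langle L(x) g(x) \rangle_{t,u} - \mathbb{E}\langle L(x) \rangle_{t,u}\, \mathbb{E}\langle g(x) \rangle_{t,u} \big| \le \int_\epsilon^a du\, \mathbb{E}\langle |L(x) - \mathbb{E}\langle L(x) \rangle_{t,u}| \rangle_{t,u}$$

More generally the same thing holds if one takes a function depending on many replicas such as
$g(x^{(1)}, x^{(2)}) = q_{12}$. Using the integration by parts formula with respect to $h_k$,

$$\mathbb{E}\langle L(x) q_{12} \rangle_{t,u} = \mathbb{E}\Big\langle \frac{1}{2K\sqrt{u}}\sum_k h_k x_k q_{12} \Big\rangle_{t,u} + \mathbb{E}\langle m_1 q_{12} \rangle_{t,u}$$

$$= \frac{1}{2}\mathbb{E}\langle (1 + q_{12}) q_{12} \rangle_{t,u} - \frac{1}{2}\mathbb{E}\langle (q_{13} + q_{14}) q_{12} \rangle_{t,u} + \mathbb{E}\langle m_1 q_{12} \rangle_{t,u}$$

$$= \frac{1}{2}\mathbb{E}\langle (1 + q_{12}) q_{12} \rangle_{t,u} = \frac{1}{2}\mathbb{E}\langle m_1 + m_1^2 \rangle_{t,u} \tag{50}$$

where in the last two equalities we used the Nishimori identity (38). By a similar calculation,

$$\mathbb{E}\langle L(x) \rangle_{t,u}\, \mathbb{E}\langle q_{12} \rangle_{t,u} = \frac{1}{2}\mathbb{E}\langle 1 - q_{12} + 2 m_1 \rangle_{t,u}\, \mathbb{E}\langle q_{12} \rangle_{t,u} = \frac{1}{2}\big( \mathbb{E}\langle m_1 \rangle_{t,u} + (\mathbb{E}\langle m_1 \rangle_{t,u})^2 \big) \tag{51}$$

From equations (50) and (51), we get

$$\int_\epsilon^a du\, \big| \mathbb{E}\langle m_1^2 \rangle_{t,u} - (\mathbb{E}\langle m_1 \rangle_{t,u})^2 \big| \le O\Big(\frac{1}{K^{1/16}}\Big)$$

Now integrating with respect to $t$ and exchanging the integrals (by Fubini's theorem), we get

$$\int_\epsilon^a du \int_0^1 dt\, \big| \mathbb{E}\langle m_1^2 \rangle_{t,u} - (\mathbb{E}\langle m_1 \rangle_{t,u})^2 \big| \le O\Big(\frac{1}{K^{1/16}}\Big)$$

The limit of the left hand side as $K \to \infty$ therefore vanishes. By Lebesgue's theorem this limit can be
exchanged with the $u$ integral and we get the desired result. (Note that one can further exchange the
limit with the $t$ integral and obtain that the fluctuations of $m_1$ vanish for almost every $(t, u)$.)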



5 Proof of independence from spreading sequence distribution: Theorem 4

We consider a communication system with spreading values $r_{ik}$ generated from a symmetric distribution
with unit variance and satisfying assumption A. We compare the capacity of this system to the Gaussian
$\mathcal{N}(0, 1)$ case whose spreading sequence values are denoted by $s_{ik}$. The comparison is done through an
interpolating system with respect to the two spreading sequences

$$v_{ik}(t) = \sqrt{t}\, r_{ik} + \sqrt{1 - t}\, s_{ik}, \qquad 0 \le t \le 1$$

Let $v(t)$ denote the matrix with entries $v_{ik}(t)$ and let $v_i(t)$ denote the $i$th row of the matrix. By the
fundamental theorem of calculus the capacities are related by

$$C_K - C_g = \mathbb{E}_R[C(r)] - \mathbb{E}_S[C(s)] = \int_0^1 dt\, \frac{d}{dt}\mathbb{E}_{v(t)}[C(v(t))]$$



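From (7) the derivative is equal to

$$\frac{d}{dt}\mathbb{E}_{v(t)}[C(v(t))] = -\mathbb{E}_S \mathbb{E}_R \frac{d}{dt}\mathbb{E}_{y \mid v(t)}[f(y, v(t))]$$

As before we can assume that the transmitted sequence is $x^0$. It is convenient to first perform the change
of variables $y = n + N^{-1/2} v(t) x^0$ and then perform the $t$ derivative. One finds

$$\frac{d}{dt}\mathbb{E}_{v(t)}[C(v(t))] = \frac{1}{\sigma^2 K \sqrt{N}}\, \mathbb{E}_{S,R,N}\Big\langle \Big( n + \frac{1}{\sqrt{N}} v(t)(x^0 - x) \Big) \cdot v'(t)(x^0 - x) \Big\rangle_t \tag{52}$$

where $\langle - \rangle_t$ is the average with respect to the normalized measure

$$\frac{1}{Z_t} \exp\Big( -\frac{1}{2\sigma^2}\|n + N^{-1/2} v(t)(x^0 - x)\|^2 \Big) \tag{53}$$

We split (52) in two contributions $T_1 - T_2$ corresponding to

$$v'(t) = \frac{1}{2\sqrt{t}}\, r - \frac{1}{2\sqrt{1 - t}}\, s$$

For $T_1$ we have

$$T_1 = \sum_{i,k} T_1(i, k) = \frac{1}{2\sqrt{t}} \sum_{i,k} \mathbb{E}_{S,R,N}[r_{ik}\, g_{ik}] \tag{54}$$

with

$$g_{ik} = \frac{1}{\sigma^2 K \sqrt{N}} \Big\langle \Big( n_i + \frac{1}{\sqrt{N}} v_i(t) \cdot (x^0 - x) \Big)(x_k^0 - x_k) \Big\rangle_t \tag{55}$$

For $T_2$ we have

$$T_2 = \sum_{i,k} T_2(i, k) = \frac{1}{2\sqrt{1 - t}} \sum_{i,k} \mathbb{E}_{S,R,N}[s_{ik}\, g_{ik}] \tag{56}$$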

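with the same expression for $g_{ik}$. For each contribution in the sums (54), (56) we use integration by
parts formulas. For (54) we use the formula (it is an exercise to check that it is valid for any symmetric
random variable)

$$\mathbb{E}[r_{ik}\, g(r_{ik})] = \mathbb{E}\Big[ r_{ik}^2\, \frac{\partial g(r_{ik})}{\partial r_{ik}} \Big] - \frac{1}{4}\mathbb{E}\Big[ |r_{ik}| \int_{-|r_{ik}|}^{|r_{ik}|} (r_{ik}^2 - u^2)\, \frac{\partial^3 g(u)}{\partial u^3}\, du \Big]$$

$$= \mathbb{E}\Big[ \frac{\partial g(r_{ik})}{\partial r_{ik}} \Big] + \mathbb{E}\Big[ (r_{ik}^2 - 1) \int_0^{r_{ik}} \frac{\partial^2 g(u)}{\partial u^2}\, du \Big] - \frac{1}{4}\mathbb{E}\Big[ |r_{ik}| \int_{-|r_{ik}|}^{|r_{ik}|} (r_{ik}^2 - u^2)\, \frac{\partial^3 g(u)}{\partial u^3}\, du \Big] \tag{57}$$

and for (56) we use the standard Gaussian (unit variance) integration by parts formula

$$\mathbb{E}[s_{ik}\, g(s_{ik})] = \mathbb{E}\Big[ \frac{\partial g(s_{ik})}{\partial s_{ik}} \Big] \tag{58}$$

When we consider $T_1 - T_2$ the term corresponding to the expectation in (58) cancels with that of the
first expectation in (57) and we get

$$T_1 - T_2 = \frac{1}{2\sqrt{t}} \sum_{i,k} \mathbb{E}\Big[ (r_{ik}^2 - 1) \int_0^{r_{ik}} \frac{\partial^2 g_{ik}(u)}{\partial u^2}\, du \Big] - \frac{1}{8\sqrt{t}} \sum_{i,k} \mathbb{E}\Big[ |r_{ik}| \int_{-|r_{ik}|}^{|r_{ik}|} (r_{ik}^2 - u^2)\, \frac{\partial^3 g_{ik}(u)}{\partial u^3}\, du \Big] \tag{59}$$

It remains to prove that both terms with the partial derivatives tend to zero as $N \to +\infty$. This
computation is rather lengthy and is deferred to Appendix E, but for the convenience of the reader we
point out the mechanism that is at work. From the expression (55) for $g_{ik}$ one sees that when the $\frac{\partial^2}{\partial u^2}$ and $\frac{\partial^3}{\partial u^3}$
derivatives are performed, extra powers $N^{-1}$ and $N^{-3/2}$ are generated. Therefore we get

$$\mathbb{E}\Big[ (r_{ik}^2 - 1) \int_0^{r_{ik}} \frac{\partial^2 g_{ik}(u)}{\partial u^2}\, du \Big] = O(N^{-5/2}) \tag{60}$$

and

$$\mathbb{E}\Big[ |r_{ik}| \int_{-|r_{ik}|}^{|r_{ik}|} (r_{ik}^2 - u^2)\, \frac{\partial^3 g_{ik}(u)}{\partial u^3}\, du \Big] = O(N^{-3}) \tag{61}$$

Since one sums over $KN$ terms one finds that the final contributions are $O(N^{-1/2})$ and $O(N^{-1})$.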
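The integration by parts formula for a symmetric, unit-variance variable $r$ can be written as $\mathbb{E}[r g(r)] = \mathbb{E}[g'(r)] + \mathbb{E}[(r^2 - 1)\int_0^r g''(u)\,du] - \frac{1}{4}\mathbb{E}[|r|\int_{-|r|}^{|r|}(r^2 - u^2) g'''(u)\,du]$, which follows from two elementary integrations by parts. The sketch below checks this form numerically. The three-point law on $\{-a, 0, a\}$ and the smooth test function $g$ are hypothetical choices made only for illustration, and the integrals are evaluated with Simpson's rule.

```python
import math

def g(u):
    return math.exp(0.5 * u) + math.sin(u)

def g1(u):   # first derivative of g
    return 0.5 * math.exp(0.5 * u) + math.cos(u)

def g2(u):   # second derivative of g
    return 0.25 * math.exp(0.5 * u) - math.sin(u)

def g3(u):   # third derivative of g
    return 0.125 * math.exp(0.5 * u) - math.cos(u)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; works for b < a as an oriented integral."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def both_sides(a=1.5):
    """Return (E[r g(r)], right-hand side) for r in {-a, 0, a} with unit variance."""
    p = 1.0 / a ** 2                      # P(|r| = a), chosen so that E[r^2] = 1
    law = [(-a, p / 2.0), (0.0, 1.0 - p), (a, p / 2.0)]
    lhs = sum(w * r * g(r) for r, w in law)
    rhs = 0.0
    for r, w in law:
        rhs += w * g1(r)
        rhs += w * (r * r - 1.0) * simpson(g2, 0.0, r)
        rhs -= w * 0.25 * abs(r) * simpson(lambda u: (r * r - u * u) * g3(u),
                                           -abs(r), abs(r))
    return lhs, rhs
```

For a Gaussian variable the two correction terms are absent, which is the content of the standard Gaussian formula; for a general symmetric law they are the terms whose smallness has to be established.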

6 Proof of existence of limit: Theorem 5

Let us recall the following relation between the free energy and the capacity,

$$C_K = -\frac{1}{2\beta} - \mathbb{E}[f(y, s)] \tag{62}$$

where $f(y, s)$ is the free energy with $p_X(x) = 2^{-K}$. This implies that it is sufficient to show the existence of the
limit for the average free energy $\mathcal{F}_K = \mathbb{E}[f(y, s)]$. The theorem is proved by showing that the sequence
$K \mathcal{F}_K$ is superadditive, $K \mathcal{F}_K \ge K_1 \mathcal{F}_{K_1} + K_2 \mathcal{F}_{K_2}$ for $K = K_1 + K_2$. From standard theorems it then
follows that the limit of $\mathcal{F}_K$ exists. As in the previous sections, working directly with this system is difficult
and hence we perturb the Hamiltonian with $h_u(x)$ as defined in (28),

$$H_u(x) = -\frac{1}{2\sigma^2}\Big\| n + \frac{1}{\sqrt{N}}\, s (1 - x) \Big\|^2 + h_u(x) \tag{63}$$

Let us define the corresponding partition function as $Z_u$ and the free energy as $\mathcal{F}_K(u) = \frac{1}{K}\mathbb{E}[\ln Z_u]$. The
original free energy is obtained by substituting $u = 0$, i.e., $\mathcal{F}_K = \mathcal{F}_K(0)$. From the uniform continuity of
$\mathcal{F}_K(u)$, it is sufficient to show the convergence of $\mathcal{F}_K(u)$ for some $u$ close to zero. Even this turns out to
be difficult and what we can show is the existence of the limit of $\int_\epsilon^a \mathcal{F}_K(u)\, du$ for any $a > \epsilon > 0$. However
this is sufficient for us due to the following: from the continuity of the free energy with $u$ (34) we have

$$\int_\epsilon^{2\epsilon} (\mathcal{F}_K(u) - |O(1)|\sqrt{u})\, du \le \epsilon\, \mathcal{F}_K \le \int_\epsilon^{2\epsilon} (\mathcal{F}_K(u) + |O(1)|\sqrt{u})\, du$$

Since the limit of the integral exists, we have

$$\big| \limsup_{K \to \infty} \mathcal{F}_K - \liminf_{K \to \infty} \mathcal{F}_K \big| \le |O(1)|\sqrt{\epsilon}$$

This $\epsilon$ can be made as small as desired and hence the theorem follows.
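The "standard theorems" for superadditive sequences are usually taken to be Fekete's lemma: if $a_{K_1 + K_2} \ge a_{K_1} + a_{K_2}$ for all $K_1, K_2$, then $a_K / K$ converges to $\sup_K a_K / K$. The toy sequence below, $a_K = cK - \sqrt{K}$, is a hypothetical example chosen only to illustrate the statement; it is not related to the actual free energy.

```python
import math

def a(K, c=0.7):
    """Toy superadditive sequence: a(K1 + K2) >= a(K1) + a(K2)."""
    return c * K - math.sqrt(K)

def is_superadditive(max_k=60):
    """Check the superadditivity inequality on all pairs below max_k."""
    return all(a(k1 + k2) >= a(k1) + a(k2) - 1e-12
               for k1 in range(1, max_k) for k2 in range(1, max_k))

def running_ratios(ks=(10, 100, 1000, 10000, 100000)):
    """Values of a(K)/K, which increase towards the limit c."""
    return [a(k) / k for k in ks]
```

Here the ratios increase towards $c$, the supremum of $a_K/K$. In the proof itself the inequality is only obtained for the $u$-integrated free energy and up to $o_K(1)$ corrections, so the lemma is applied in a slightly generalized form.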

Let $K = K_1 + K_2$ and let $N_i = K_i / \beta \in \mathbb{N}$ for $i = 1, 2$. This assumption can be removed by considering their
integer parts, but we will stick to it to simplify the proof. Split the $N \times K$ dimensional
spreading matrix $s$ into two parts of dimension $N_1 \times K$ and $N_2 \times K$ and denote these matrices by $s_1, s_2$
respectively. Let $t_1, t_2$ be two spreading matrices with dimensions $N_1 \times K_1$ and $N_2 \times K_2$. All the entries
of these matrices are distributed as $\mathcal{N}(0, 1)$ and the noise is Gaussian with variance $\sigma^2$. Similarly split
the noise vector $n = (n_1, n_2)$ where $n_i$ is of length $N_i$ and $x = (x_1, x_2)$ where $x_i$ is of length $K_i$. Let us
consider the following Hamiltonian:

$$H_{t,u}(x) = -\frac{1}{2\sigma^2}\Big\| n_1 + \sqrt{\frac{t}{N}}\, s_1 (1 - x) + \sqrt{\frac{1 - t}{N_1}}\, t_1 (1 - x_1) \Big\|^2 - \frac{1}{2\sigma^2}\Big\| n_2 + \sqrt{\frac{t}{N}}\, s_2 (1 - x) + \sqrt{\frac{1 - t}{N_2}}\, t_2 (1 - x_2) \Big\|^2 + h_u(x)$$

Note that the all-one vectors $1$ appearing above are of different dimensions (the dimension is clear from
the context). For a moment neglect the $h_u(x)$ part of the Hamiltonian and consider the remaining
part. At $t = 1$, we get the Hamiltonian corresponding to an $N \times K$ CDMA system with spreading
matrix $\begin{pmatrix} s_1 \\ s_2 \end{pmatrix}$. At $t = 0$ we get the Hamiltonian corresponding to two independent CDMA systems with
spreading matrices $t_i$ of dimensions $N_i \times K_i$. As before we perturb the Hamiltonian with $h_u(x)$ so that
we can use the concentration results for the magnetization.

Let $Z_{t,u}$ be the partition function with this Hamiltonian; the corresponding average free energy
is given by $g_{t,u} = \frac{1}{K}\mathbb{E}[\ln Z_{t,u}]$. Note that $g_{1,u} = \mathcal{F}_K(u)$ and $g_{0,u} = \frac{K_1}{K}\mathcal{F}_{K_1}(u) + \frac{K_2}{K}\mathcal{F}_{K_2}(u)$. From the
fundamental theorem of calculus,

$$g_{1,u} = g_{0,u} + \int_0^1 \frac{d}{dt} g_{t,u}\, dt \tag{64}$$

Let $z_i = 1 - x_i$ and $Z_i = n_i + \sqrt{\frac{t}{N}}\, s_i z + \sqrt{\frac{1 - t}{N_i}}\, t_i z_i$. Using the integration by parts formula with respect to the
spreading sequences, the derivative can be simplified as follows

i=l,2 

i=l,2 1 

The system with Hamiltonian $H_{t,u}(x)$ has Nishimori symmetry and hence we can derive results similar
to Theorem 7 and Lemma ??. In addition to these we need one more Nishimori identity which we did
not use before.

e((^'.^»)(^«>.^-^".^)) io 

= E(( a ,-a)(ii- 4 -ii- ii )) titi (66) 

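Let

$$m_1 = \frac{1}{K}\sum_{k=1}^{K} x_k, \qquad m_{11} = \frac{1}{K_1}\sum_{k=1}^{K_1} x_k \qquad \text{and} \qquad m_{12} = \frac{1}{K_2}\sum_{k=K_1+1}^{K} x_k$$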
Let e > be fixed. Using -L E(||^ i || 2 ) tilI = 1 and Theorem[71 for a.e., u > e and a.e., t > 0, we get 
-r:9t,u =2j(Z4 E E (^; ' £») u E ( mi ~ m i*k« + 

i=l,2 



= 27^4 E E (^' (V^ s ^+ V^W U -^)t E (™i-mu)t,u + o K (l) (67) 

i=l,2 V V i >« 

Now using the integration by parts formula with respect to the spreading sequences, and doing transformations
similar to section 3.4.2, we get for a.e. $u > \epsilon$ and a.e. $t > 0$,



d 1 A I E((l-mi)^+(l-mn)(l-t)) t , M 

eft; 2Atr 4 ,f^ 2 1 + /3cr 2 E((1 - mijrj + (1 - mu)(l - t)) t ,u 

KiE(mi - mii) t ,u 



1 + /3<t- 2 E((1 - mi)i + (1 - m H )(l - t)) t , u + ° K ( } ( } 



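Let us define a function $\eta_{a,b_1,b_2}(t)$ as follows,

$$\eta_{a,b_1,b_2}(t) = \frac{1}{2K\sigma^2}\sum_{i=1,2} \frac{K_i (a - b_i)}{1 + \beta\sigma^{-2}((1 - a)t + (1 - b_i)(1 - t))}$$

Note that for $a = \mathbb{E}\langle m_1 \rangle_{t,u}$, $b_i = \mathbb{E}\langle m_{1i} \rangle_{t,u}$ we get the summation in (68). When $a, b_i$ satisfy

$$a = \frac{K_1}{K} b_1 + \frac{K_2}{K} b_2 \tag{69}$$

the function $\eta_{a,b_1,b_2}(t)$ has the following useful properties: $\eta_{a,b_1,b_2}(1) = 0$ and the derivative with respect to $t$ of
this function is given by

$$\frac{d}{dt}\eta_{a,b_1,b_2}(t) = \frac{\beta}{2K\sigma^4}\sum_{i=1,2} \frac{K_i (a - b_i)^2}{(1 + \beta\sigma^{-2}((1 - a)t + (1 - b_i)(1 - t)))^2} \ge 0 \tag{70}$$

Therefore for any $a, b_i$ satisfying (69), $\eta_{a,b_1,b_2}(t) \le 0$ and hence we can claim the summation in (68) is
also non-positive.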

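Bringing the $o_K(1)$ in (68) to the left, we get for a.e. $u > \epsilon$,

$$\frac{d}{dt} g_{t,u} + o_K(1) \le 0 \tag{71}$$

Therefore for a.e. $u > \epsilon$, we get

$$g_{1,u} + o_K(1) \le g_{0,u} \tag{72}$$

Let $a > \epsilon$ be a constant. Then

$$\int_\epsilon^a g_{1,u}\, du + o_K(1) \le \int_\epsilon^a g_{0,u}\, du$$

which implies

$$\int_\epsilon^a \mathcal{F}_K(u)\, du + o_K(1) \le \frac{K_1}{K}\int_\epsilon^a \mathcal{F}_{K_1}(u)\, du + \frac{K_2}{K}\int_\epsilon^a \mathcal{F}_{K_2}(u)\, du$$

which in turn implies that $\lim_{K \to \infty} \int_\epsilon^a \mathcal{F}_K(u)\, du$ exists.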

7 Extensions 

In this section we briefly describe three variations for which our methods extend in a straightforward 
manner. 

7.1 Unequal Powers 

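Suppose that the users transmit with unequal powers $P_k$,

$$y_i = \frac{1}{\sqrt{N}}\sum_{k=1}^K s_{ik}\sqrt{P_k}\, x_k + \sigma n_i$$

with normalized average power $\frac{1}{K}\sum_k P_k = 1$. We assume that the empirical distribution of the $P_k$ tends
to a distribution and denote the corresponding expectation by $\mathbb{E}_P[-]$.

The interpolation method can be applied as before. We interpolate between the true communication
system and a decoupled one where

$$\tilde{y}_k = \sqrt{P_k}\, x_k + \frac{1}{\sqrt{\lambda}}\, w_k$$

Let $\mathcal{P}$ denote the diagonal matrix $P_k \delta_{kk'}$. The relevant posterior measure replacing (29) is now

$$p_{t,u}(x \mid n, w, h, s) = \frac{1}{Z_{t,u}} \exp\Big( -\frac{1}{2}\|n + N^{-1/2} B(t)^{1/2} s \sqrt{\mathcal{P}}\,(x^0 - x)\|^2 - \frac{1}{2}\|w + \lambda(t)^{1/2} \sqrt{\mathcal{P}}\,(x^0 - x)\|^2 + h_u(x) \Big) \tag{73}$$

where $\lambda(t)$ and $B(t)$ are related as in (23). The whole analysis can again be performed in exactly
the same manner with the proviso that the correct "order parameters" are now $m_1 = \frac{1}{K}\sum_k P_k x_k$ and
$q_{12} = \frac{1}{K}\sum_k P_k x_k^{(1)} x_k^{(2)}$. One finds in place of (48)

$$\frac{1}{2} + \mathbb{E}[f_{1,u}] = \mathbb{E}_P\Big[ \int Dz \ln(2\cosh(\sqrt{P\lambda}\, z + P\lambda)) \Big] - \frac{1}{2\beta} - \frac{\lambda}{2}(1 + m) - \frac{1}{2\beta}\ln(1 + \beta B(1 - m)) + \int_0^1 R(t)\, dt + o_N(1) + O(\sqrt{u})$$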



where $R(t)$ has the same form as before but with the new definition of $m_1$. From the positivity of $R(t)$
we deduce the upper bound (21) on the capacity with $c_{RS}(m)$ replaced by



-E, 



J Dz ln(2 cosh(\/PAz + PA) 



— (1 + to) In Act 

2 V ' 2/3 



7.2 Colored Noise 

Now consider the scenario where 



K 

E 

fe=i 



with colored noise of finite memory. More precisely we assume that the covariance matrix $\mathbb{E}[n_i n_j] = C(i, j)$
(which depends on $|i - j|$) is circulant as $N \to +\infty$ and has a well defined (real) Fourier transform
(the noise spectrum) $C(\omega)$. The covariance matrix is real symmetric and thus can be diagonalized by
an orthogonal matrix: $\Gamma = O C O^T$ with $O O^T = O^T O = I$. As $N \to +\infty$ the eigenvalues are well
approximated by $\gamma_n = C(2\pi n / N)$. Multiplying the received signal by $\Gamma^{-1/2} O$ the input-output relation
becomes

A 1 

fe=l 
where 

$$y'_i = (\Gamma^{-1/2} O y)_i, \qquad n'_i = (\Gamma^{-1/2} O n)_i$$

The new noise vector $n'$ is white with unit variance, but the spreading matrix is now correlated with



K 

^= / Hk^k 

N ^ 



(74) 



One may guess that this time the interpolation is done between the true system and the decoupled
channels

$$\tilde{y}_k = x_k + \frac{1}{\sqrt{\lambda_{col}}}\, w_k$$

where this time

$$\lambda_{col} = \int_0^{2\pi} \frac{d\omega}{2\pi}\, \frac{B}{C(\omega) + \beta B(1 - m)}$$

Note that $C(\omega) = 1$ when the noise is white and we get back the $\lambda$ defined in (11). The interpolating
system has the same posterior as in (29) but with $\lambda_{col}(t)$ and $B(t)$ related by

$$\int_0^{2\pi} \frac{d\omega}{2\pi}\, \frac{B(t)}{C(\omega) + \beta B(t)(1 - m)} + \lambda_{col}(t) = \int_0^{2\pi} \frac{d\omega}{2\pi}\, \frac{B}{C(\omega) + \beta B(1 - m)}$$
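The colored-noise parameter $\lambda_{col} = \int_0^{2\pi} \frac{d\omega}{2\pi} \frac{B}{C(\omega) + \beta B(1 - m)}$ is easy to evaluate numerically. The sketch below uses a midpoint rule, checks that a flat spectrum $C(\omega) = 1$ gives back $B / (1 + \beta B(1 - m))$, and evaluates a hypothetical spectrum $C(\omega) = 1 + \rho\cos\omega$, which is an assumption made purely for illustration.

```python
import math

def lambda_col(B, beta, m, spectrum, n=2000):
    """(1/2pi) * integral over [0, 2pi] of B / (C(w) + beta*B*(1-m)), midpoint rule."""
    h = 2.0 * math.pi / n
    total = sum(B / (spectrum((i + 0.5) * h) + beta * B * (1.0 - m))
                for i in range(n))
    return total * h / (2.0 * math.pi)

def lambda_white(B, beta, m):
    """Value of the same integral for the flat spectrum C(w) = 1."""
    return B / (1.0 + beta * B * (1.0 - m))

def example_spectrum(w, rho=0.5):
    """Hypothetical positive noise spectrum with average 1 over a period."""
    return 1.0 + rho * math.cos(w)
```

For this particular spectrum the integral has the closed form $B / \sqrt{A^2 - \rho^2}$ with $A = 1 + \beta B(1 - m)$, which is larger than the white-noise value at the same average noise power, as expected from the convexity of $C \mapsto 1/(C + \text{const})$.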



The only difference in the subsequent analysis is in the algebraic manipulations for the term $T_2$ in section
3.4.2. Indeed these require integrations by parts with respect to the spreading sequence which involve
(74). The analog of (47) now becomes

$$T_2 = -\frac{1}{N}\sum_{n=1}^N \frac{B'(t)\, \mathbb{E}\langle 1 - m_1 \rangle_{t,u}}{2(\gamma_n + \beta B(t)\, \mathbb{E}\langle 1 - m_1 \rangle_{t,u})} \approx -\int_0^{2\pi} \frac{d\omega}{2\pi}\, \frac{B'(t)\, \mathbb{E}\langle 1 - m_1 \rangle_{t,u}}{2(C(\omega) + \beta B(t)\, \mathbb{E}\langle 1 - m_1 \rangle_{t,u})} \tag{75}$$

This finally leads to the bound on capacity with $c_{RS}(m)$ replaced by



do; 



■ y Dz In 2 cosh(v / A co /z + A co /) + ^-(1 + to) 



2tt 



do; 



^ ^ln 



C(u) 



2/3 J 2tt C(w) + 0(1 - to)

7.3 Gaussian Input

The interpolation method also works for non binary inputs. Here we consider the simplest case of
Gaussian inputs with distribution (13) (which achieves the maximum of the mutual information for any
symmetric $s_{ik}$). Here we outline the necessary changes in the analysis.

The interpolation is done as explained in section 2, except that the measure (25) is multiplied by the Gaussian
distribution (13). In (26) we also have to include this Gaussian factor and the sum over $x^0$ is replaced
by an integral. Then as in section 3.1 we do the change of variables $y \to B(t)^{-1/2} n + N^{-1/2} s x^0$ and
$\tilde{y} \to \lambda(t)^{-1/2} w + x^0$. The posterior measure used for the interpolation becomes

$$p_{t,u}(x \mid n, w, h, s) = \frac{1}{Z_{t,u}} \exp\Big( -\frac{1}{2}\|n + N^{-1/2} B(t)^{1/2} s (x^0 - x)\|^2 - \frac{1}{2}\|w + \lambda(t)^{1/2} (x^0 - x)\|^2 + h_u(x) \Big)\, \frac{e^{-\|x\|^2 / 2}}{(2\pi)^{K/2}} \tag{76}$$

and we have to compute $\lim_{K \to +\infty} \lim_{u \to 0} \mathbb{E}[f_{1,u}(n, w, h, s, x^0)]$. The main difference is that now the
expectation $\mathbb{E}$ is also with respect to the Gaussian vector $x^0$. The algebra is done as in section 3 except
that $x_k^0$ is not set to one, $z_k$ is replaced by $x_k^0 - x_k$ and the correct order parameters are $m_1 = \frac{1}{K}\sum_k x_k^0 x_k$
and $q_{12} = \frac{1}{K}\sum_k x_k^{(1)} x_k^{(2)}$.

The interpolation method then yields in place of (|48f 

A Z" 1 _ 

+ -(l-m)+ / R(t)dt + 0{yfu) 

* Jo 

where $R(t)$ is the same function as before but with the new definition of $m_1$. Again the positivity of $R(t)$
implies that the replica solution is an upper bound to the capacity.



8 Concluding remarks 

In this contribution we have shown that the capacity of the binary input CDMA system with random
spreading is upper bounded by the formula conjectured by Tanaka using the replica method. Our approach
is to develop an interpolation method for this system. This idea has its origins in statistical
mechanics and has been applied to Gaussian energy models. The current system is very different
from those models and the proof we develop is also significantly different. In fact this model is closer to
the Hopfield model for neural networks, for which the interpolation method is still an open problem.

We also show that the capacity and the free energy functions concentrate around their averages in
the large-system limit. In addition we prove a weak concentration for the magnetization for a system
which is slightly perturbed using a Gaussian field. It might be interesting to show a similar result for the
CDMA system itself, which would have some implications towards proving the concentration of the BER. We
also show the independence of the capacity from the spreading sequence distribution in the large-system
limit.

We expect that the powerful probabilistic tools used here have applications in other similar situations
in communication systems. We have shown some of the extensions here but there are many other cases,
such as constellations other than binary or CDMA with LDPC coded communication, to name a few, to which
this method can be applied. In all these cases we can prove an upper bound on the capacity. The most
interesting and also important open problem is to prove the lower bound. This seems to be a difficult
problem and again the standard techniques fail. Other important problems are proving the conjectures
related to the BER of various decoders.



A Relation between capacity and free energy 

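Replacing (3) in the conditional entropy

$$H(X \mid Y) = -\mathbb{E}_y\Big[ \sum_x p(x \mid y, s) \ln p(x \mid y, s) \Big]$$

$$= \mathbb{E}_y\Big[ \sum_x p(x \mid y, s) \ln Z(y, s) \Big] + \mathbb{E}_y\Big[ \sum_x p(x \mid y, s)\, \frac{1}{2\sigma^2}\|N^{-1/2} s x - y\|^2 \Big] - \mathbb{E}_y\Big[ \sum_x p(x \mid y, s) \ln p_X(x) \Big] \tag{77}$$

The first term on the r.h.s. is equal to $\mathbb{E}_y[\ln Z(y, s)]$ because $\sum_x p(x \mid y, s) = 1$. The second term on the
r.h.s. can be computed exactly. Indeed,

$$\mathbb{E}_y\Big[ \sum_x p(x \mid y, s)\, \frac{1}{2\sigma^2}\|N^{-1/2} s x - y\|^2 \Big] = \int \frac{dy}{(2\pi\sigma^2)^{N/2}}\, Z(y, s) \sum_x p(x \mid y, s)\, \frac{1}{2\sigma^2}\|N^{-1/2} s x - y\|^2$$

$$= \sum_x p_X(x) \int \frac{dy}{(2\pi\sigma^2)^{N/2}}\, \frac{1}{2\sigma^2}\|N^{-1/2} s x - y\|^2\, e^{-\frac{1}{2\sigma^2}\|N^{-1/2} s x - y\|^2} = \frac{N}{2} = \frac{K}{2\beta}$$

A similar calculation shows that the third term is equal to $H(X)$. Therefore the relation between
Shannon's conditional entropy and the free energy is

$$H(X \mid Y) = \mathbb{E}_y[\ln Z(y, s)] + \frac{N}{2} + H(X)$$

This is equivalent to the announced relation (7).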
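The relation $H(X \mid Y) = \mathbb{E}_y[\ln Z(y, s)] + \frac{N}{2} + H(X)$ can be checked on a toy instance where everything is computable by one-dimensional quadrature. The sketch below assumes $N = 1$, $K = 2$, a uniform prior on $\{-1, +1\}^2$, and $Z(y, s) = \sum_x p_X(x)\, e^{-(y - N^{-1/2} s \cdot x)^2 / (2\sigma^2)}$; the spreading values and noise level are arbitrary illustrative choices.

```python
import itertools
import math

def check_relation(s=(0.8, -1.3), sigma=0.9, lo=-14.0, hi=14.0, n=14001):
    """Return (H(X|Y), E[ln Z] + N/2 + H(X)) for a single-chip toy system."""
    N = 1
    xs = list(itertools.product((-1, 1), repeat=len(s)))
    prior = 1.0 / len(xs)
    h = (hi - lo) / (n - 1)
    cond_entropy = 0.0
    mean_log_z = 0.0
    for i in range(n):
        y = lo + i * h
        weight = h * (0.5 if i in (0, n - 1) else 1.0)   # trapezoid weights
        terms = [prior * math.exp(-(y - sum(a * b for a, b in zip(s, x)) / math.sqrt(N)) ** 2
                                  / (2.0 * sigma ** 2)) for x in xs]
        z = sum(terms)
        if z == 0.0:
            continue
        p_y = z / math.sqrt(2.0 * math.pi * sigma ** 2)   # density of y
        posterior = [t / z for t in terms]
        cond_entropy -= weight * p_y * sum(p * math.log(p) for p in posterior if p > 0.0)
        mean_log_z += weight * p_y * math.log(z)
    entropy_x = math.log(len(xs))
    return cond_entropy, mean_log_z + N / 2.0 + entropy_x
```

The two returned numbers should agree up to quadrature error. Setting $s = (0, 0)$ is a quick sanity check: the output carries no information, $\mathbb{E}_y[\ln Z] = -\frac{1}{2}$, and both sides reduce to $H(X) = \ln 4$.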



B Probabilistic tools 

Our proofs rely on a general concentration theorem for suitable Lipschitz functions of many Gaussian
random variables [24], [8] and this is why we need Gaussian signature sequences. In the version that we
use here we need functions that are Lipschitz with respect to the Euclidean distance. More precisely we
say that a function $f : \mathbb{R}^M \to \mathbb{R}$ is a Lipschitz function with constant $L_M$ if for all $(u, v) \in \mathbb{R}^M \times \mathbb{R}^M$

$$|f(u) - f(v)| \le L_M \|u - v\|$$

When another distance is used the function will still be Lipschitz but one has to carefully keep track of
the possibly qualitatively different $M$ dependence.

Theorem 8. [24] Let $U = (U_1, \dots, U_M)$ be $M$ independent identically distributed Gaussian random
variables with distribution $\mathcal{N}(0, v^2)$ and let $f : \mathbb{R}^M \to \mathbb{R}$ be Lipschitz with respect to the Euclidean
distance, with constant $L_M$. Then $f$ satisfies

$$\mathbb{P}[|f(U) - \mathbb{E}[f(U)]| \ge t] \le 2 e^{-\frac{t^2}{2 v^2 L_M^2}}$$
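A small Monte Carlo experiment gives a feeling for the Gaussian concentration bound $\mathbb{P}[|f(U) - \mathbb{E}f(U)| \ge t] \le 2 e^{-t^2 / (2 v^2 L_M^2)}$. The sketch below assumes the test function $f(u) = \|u\|$, which is Lipschitz with constant $L_M = 1$ for the Euclidean distance, and replaces $\mathbb{E}[f(U)]$ by the sample mean; the dimension, variance, threshold and seed are arbitrary.

```python
import math
import random

def tail_vs_bound(M=20, v=1.5, t=2.0, samples=20000, seed=7):
    """Return (empirical tail probability, concentration bound) for f(u) = ||u||."""
    rng = random.Random(seed)
    values = [math.sqrt(sum(rng.gauss(0.0, v) ** 2 for _ in range(M)))
              for _ in range(samples)]
    mean = sum(values) / samples          # Monte Carlo stand-in for E[f(U)]
    tail = sum(abs(x - mean) >= t for x in values) / samples
    lipschitz = 1.0                       # Lipschitz constant of the Euclidean norm
    bound = 2.0 * math.exp(-t * t / (2.0 * v * v * lipschitz ** 2))
    return tail, bound
```

The empirical tail is typically far below the bound, which is not tight for this function. The important feature is that the bound does not depend on the dimension $M$ except through the Lipschitz constant.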

In our application it will not be possible to apply this theorem directly because the relevant function
is Lipschitz only on a subset $G \subset \mathbb{R}^M$. It turns out that the measure of the complement $G^c$ is negligible
as $M \to +\infty$. For the "good part" of the function supported on $G$ we will use the following result of
McShane and Whitney.

Theorem 9. [25] Let $f : G \to \mathbb{R}$ be Lipschitz over $G \subset \mathbb{R}^M$ with constant $L_M$. Then there exists an
extension $g : \mathbb{R}^M \to \mathbb{R}$ such that $g|_G = f$ which is Lipschitz with the same constant over the whole of
$\mathbb{R}^M$.

From these two theorems we can prove the following.

Lemma 7. Let $f$ and $g$ be as in Theorem 9. Assume $0 \in G$ and $\mathbb{E}[f(U)^2] \le C^2$, $f(0)^2 \le C^2$ for some
positive number $C$. Then for

$$\frac{t}{2} \ge 3(C + v\sqrt{M} L_M)\sqrt{\mathbb{P}(G^c)}$$

we have

$$\mathbb{P}[|f(U) - \mathbb{E}[f(U)]| \ge t] \le 2 e^{-\frac{t^2}{8 v^2 L_M^2}} + \mathbb{P}[G^c]$$

Proof. We drop the $U$ dependence to lighten the notation. Notice that $0 \in G$ implies $f(0) = g(0)$. Thus
$g(0)^2 \le C^2$. Also, since $g$ is Lipschitz on the whole of $\mathbb{R}^M$,

$$\mathbb{E}[g^2] \le 2(g(0)^2 + \mathbb{E}[(g - g(0))^2]) \le 2(C^2 + L_M^2\, \mathbb{E}[\|U\|^2]) = 2(C^2 + M v^2 L_M^2)$$

Furthermore on $G$ we have $g = f$, so by the Cauchy-Schwarz inequality

$$|\mathbb{E}[g - f]| = |\mathbb{E}[(g - f) 1_{G^c}]| \le (\mathbb{E}[g^2]^{1/2} + \mathbb{E}[f^2]^{1/2})\sqrt{\mathbb{P}[G^c]} \le (C + \sqrt{2}(C^2 + M v^2 L_M^2)^{1/2})\sqrt{\mathbb{P}[G^c]} \le 3(C + v\sqrt{M} L_M)\sqrt{\mathbb{P}[G^c]} \le \frac{t}{2}$$

Moreover

$$\mathbb{P}[|f - \mathbb{E}f| \ge t] = \mathbb{P}[|g - \mathbb{E}f| \ge t \mid U \in G]\, \mathbb{P}[G] + \mathbb{P}[|f - \mathbb{E}f| \ge t \mid U \in G^c]\, \mathbb{P}[G^c] \le \mathbb{P}[|g - \mathbb{E}g| \ge t - |\mathbb{E}g - \mathbb{E}f|] + \mathbb{P}[G^c]$$

The result of the lemma then follows from

$$\mathbb{P}[|g - \mathbb{E}g| \ge t - |\mathbb{E}g - \mathbb{E}f|] \le \mathbb{P}\Big[ |g - \mathbb{E}g| \ge \frac{t}{2} \Big]$$

and the application of Theorem 8. □

In order to prove Theorems 1 and 2 it will be sufficient to find suitable sets $G$ with measure nearly
equal to one (as $M \to +\infty$), on which the capacity and free energy have a Lipschitz constant $L_M \to 0$.

C Proofs of Theorems 1 and 2

For the proofs, it is convenient to reformulate the statements of the theorems as follows. Let 1 be 
the K dimensional vector (1, 1), s° be the K x N matrix with elements SikX^. We set p x (x) — 

Ylu—iPx(xkxT) and consider the partition function 

Z'(n,s°) = J2p x^ e ~^ llN ' 1/2s °^~ k) ~' T - r ( 78 ) 

X 

where we recall that n = (m, ...,njv) are independent Gaussian variables 7V"(0, 1). Notice that due to 
the invariance of the distribution of Sik under the transformation Sik — > x° k Sik, 

Ew, s [lnZ'(n,s )] = E K , s [\nZ' {n,s)\ 

The statements of Theorems Q] and [2] are equivalent to 

¥[\^px(x )EN[^Z\n,s°)]-E Ki s[^Z'(n,s)]\ > tK] < 3e~ Ql * 27V (79) 

a; 

and 

PEwfe°)l lllZ 'fe s °)-%,s[ln2'fes)]| > tK] < 3e-" 2t2 ^ (80) 

X? 

To see this use the change of variable $y = N^{-1/2} s x^0 + \sigma n$ followed by $x_k \to x_k x_k^0$ in the partition function
summation (4).

C.1 Proof of (79)

Let $B$ be a positive constant to be chosen later and define

$$G = \{ s \mid \text{for all } x, x^0, \ \|s^0 (x - 1)\|^2 \le BN \}$$

Lemma 8. We have the following estimate for the measure of $G^c$,

$$\mathbb{P}(G^c) \le 3^K\, 2^{N/2}\, e^{-\frac{B}{16\beta}}$$

Proof. First notice that for any given $x$,

$$\frac{1}{\sqrt{K}}\sum_{k=1}^K s_{ik}^0 (x_k - 1), \qquad i = 1, \dots, N$$

are independent Gaussian random variables with zero mean and variance ($a^2$) smaller than 4. Thus the
identity

$$\int \frac{dv}{\sqrt{2\pi a^2}}\, e^{-\frac{v^2}{2a^2}}\, e^{\frac{v^2}{16}} = \Big( 1 - \frac{a^2}{8} \Big)^{-1/2}$$

implies (because $a^2 \le 4$)

$$\mathbb{E}\big[ e^{\frac{1}{16K}\|s^0 (x - 1)\|^2} \big] \le 2^{N/2}$$

Then from the Markov inequality, for any $x$

$$\mathbb{P}(\|s^0 (x - 1)\|^2 \ge BN) \le 2^{N/2} e^{-\frac{BN}{16K}} = 2^{N/2} e^{-\frac{B}{16\beta}}$$

The result of the lemma then follows from the union bound over $3^K$ possible $x - x^0$ vectors. □
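The Gaussian moment identity used in the proof of Lemma 8 is the standard formula $\mathbb{E}[e^{c X^2}] = (1 - 2 c a^2)^{-1/2}$ for $X \sim \mathcal{N}(0, a^2)$ and $2 c a^2 < 1$, taken at $c = 1/16$. The sketch below compares it with a direct trapezoid-rule quadrature; the integration window and grid size are arbitrary numerical choices.

```python
import math

def mgf_quadrature(a2, c=1.0 / 16.0, n=20001, width=12.0):
    """E[exp(c X^2)] for X ~ N(0, a2), by the trapezoid rule on [-width*a, width*a]."""
    a = math.sqrt(a2)
    lo, hi = -width * a, width * a
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(c * x * x - x * x / (2.0 * a2))
    return total * h / math.sqrt(2.0 * math.pi * a2)

def mgf_closed_form(a2):
    """Closed form (1 - a2/8)^(-1/2), valid for a2 < 8."""
    return (1.0 - a2 / 8.0) ** -0.5
```

At the largest allowed variance $a^2 = 4$ the closed form equals $\sqrt{2}$, so each of the $N$ independent coordinates contributes at most a factor $2^{1/2}$, which is where the factor $2^{N/2}$ in the estimate comes from.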

We will apply Lemma [8] to 

/(s) = ^EP2Lfe°)%[^^fes )] 

X? 

for a suitable choice of B. In the application the matrix s is to be thought as a vector with KN 
components and norm 



s = 



N K 

i=i fc=i 



Clearly € G and /(0) 2 = (^Ejv[i||n|| 2 ]) 2 = 1/4/3 2 . Also it is evident that lnZ'(n,s°) < 0. On the 
other hand restricting the sum in the partition function to x — 1 we have 

^ E^°) E iv[ln^fes )] > -^E K [a 2 \\n\\ 2 } - ±H(X) >~- In 2 

x° 

Therefore we have 

Es[/(s) 2 ]<(^+ln2) 2 
Let us now compute the Lipschitz constant. 
Lemma 9. -K" EivEx Px(x°) In Z'{n, s )] is Lipschitz on G, with constant 

L N = a^y^R-^VB + VNo-)
Proof. The exponent of the partition function (a Hamiltonian) is

$$H(n, s^0, x) = \frac{1}{2\sigma^2}\|N^{-1/2} s^0 (x - 1) - \sigma n\|^2 \tag{81}$$

In the section [CL4l we show that for (s, t) G G x G 

\H{n,s\x)-H{n^\x)\<a- 2 2^{^+\\n\\)\\s-t\\ (82) 
Using this inequality together with 

-H{n,s°,x) < ~H{n,t°,x) + \H(n,8°,x) - i?(n,t , £ )| 

we have for (s, t) € G X G 

£xPj(g)e*P(-g(&s ,3Q) 
n E,Pife)exp(-H(n,t°, ; E)) 

Px fe) exp(|if (n, s° , s) - H (n, t° , a) | - H (n, t° , x) ) 
" ln E,Plfe)ex P (- J ff(li,t ,a;)) 

< o-- 2 2yff3(V~B + ||n||)||s-t|| 
Therefore taking the expectation over the noise, we get 

| Px(£°)Ejv[ln Z'(n, s )] - £ Px _(z )Ejv[ln Z'(n, t )] | 

<a- 2 2y^/S + aE[||n||])||s-t|| 
< <J- 2 2^(VB + o-E[\\nf] 1/2 )\\s - t|| 

which yields the Lipschitz constant of the lemma. □ 

Finally (79|) follows from Lemmas H M and M with the choice B = 32(3(2K + N). We obtain a\ = 
1/(8KL%) > o- 4 /(16/3(64/3 + 32 + a 2 )). 

C.2 Proof of (80)

This case is more cumbersome but the ideas are the same. We choose the set G as 
G = js,?! | max|n 2 | < \[~A and for all x, \\s°(x- 1)|| 2 < BN^ 

where, as before A and B will be chosen appropriately later on. For Gaussian noise P[|n;| > \f~A\ < 4e~T 
therefore from the union bound P(maxj InJ > VA) < ANe~^ . Using Lemma |8] we obtain an estimate 
for the measure of G c , 

P[G C ] < 4iVe-T + 2 K +f e"W 



The goal is to apply Lemma [7] to $f(n, s) = \ln Z'(n, s^0)$ defined on $\mathbb{R}^N \times \mathbb{R}^{NK}$.

Clearly $(0, 0) \in G$, $f(0, 0) = \ln 2$ and by the same argument as before we have $\mathbb{E}[f(n, s)^2] \le (\frac{1}{2\beta} + \ln 2)^2 = C^2$. It remains to compute the Lipschitz constant.

Lemma 10. The free energy $K^{-1} \ln Z'(n, s^0)$ is Lipschitz on $G$ with constant

$$L_N = \sigma^{-2}(2\sqrt{\beta} + \sigma)K^{-1}(\sigma\sqrt{NA} + \sqrt{B})$$

Proof. For the same Hamiltonian (81) we show in Section C.3

$$|H(n, s^0, x) - H(m, t^0, x)| \le \sigma^{-2}(2\sqrt{\beta} + \sigma)(\sigma\sqrt{NA} + \sqrt{B})\,\|(n, s) - (m, t)\| \qquad (83)$$

Then proceeding in the same way as in the proof of Lemma 9 we get

$$|\ln Z'(n, s^0) - \ln Z'(m, t^0)| \le \sigma^{-2}(2\sqrt{\beta} + \sigma)(\sigma\sqrt{NA} + \sqrt{B})\,\|(n, s) - (m, t)\|$$

□

We can now conclude the proof of (80) by collecting the previous results and choosing $A = \sqrt{N}/\sigma^2$
and $B = 32\beta(K + N)$. This gives $\alpha_2 = 1/(8\sqrt{K}L_N^2) \ge \sigma^4\beta^{3/2}/(32(2\sqrt{\beta} + \sigma)^2)$.

C.3 Proof of (83)

Let $n$, $m$ be two noise realizations and $s$, $t$ two spreading sequences all belonging to the appropriate set
$G$. Let $y = x - 1$. First we expand the Euclidean norms

$$\|N^{-1/2}s^0y - \sigma n\|^2 - \|N^{-1/2}t^0y - \sigma m\|^2$$

$$= \sigma^2\|n\|^2 - \sigma^2\|m\|^2 + N^{-1}(\|s^0y\|^2 - \|t^0y\|^2) - 2\sigma N^{-1/2}(n^\top \cdot s^0y - m^\top \cdot t^0y)$$

$$= \sigma^2(n - m)^\top \cdot (n + m) + N^{-1}(s^0y - t^0y)^\top \cdot (s^0y + t^0y) - 2\sigma N^{-1/2}(n - m)^\top \cdot s^0y - 2\sigma N^{-1/2}m^\top \cdot (s^0y - t^0y)$$
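The four-term expansion above is a purely algebraic identity and can be verified on random inputs. The following sketch does so with small hypothetical dimensions; the gauge factor $x^0$ is absorbed into the matrices, so `s` and `t` play the role of $s^0$ and $t^0$.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def expansion_gap(rng, N=6, K=4, sigma=0.7):
    """|LHS - RHS| of the four-term expansion for random n, m, s, t, y."""
    n = [rng.gauss(0.0, 1.0) for _ in range(N)]
    m = [rng.gauss(0.0, 1.0) for _ in range(N)]
    s = [[rng.choice((-1.0, 1.0)) for _ in range(K)] for _ in range(N)]
    t = [[rng.choice((-1.0, 1.0)) for _ in range(K)] for _ in range(N)]
    y = [rng.choice((0.0, -2.0)) for _ in range(K)]  # y = x - 1, x in {-1, +1}^K
    sy = [dot(row, y) for row in s]
    ty = [dot(row, y) for row in t]
    c = N ** -0.5
    lhs = (sum((c * a - sigma * b) ** 2 for a, b in zip(sy, n))
           - sum((c * a - sigma * b) ** 2 for a, b in zip(ty, m)))
    n_minus_m = [a - b for a, b in zip(n, m)]
    n_plus_m = [a + b for a, b in zip(n, m)]
    diff = [a - b for a, b in zip(sy, ty)]
    summ = [a + b for a, b in zip(sy, ty)]
    rhs = (sigma ** 2 * dot(n_minus_m, n_plus_m)     # first term
           + dot(diff, summ) / N                     # second term
           - 2.0 * sigma * c * dot(n_minus_m, sy)    # third term
           - 2.0 * sigma * c * dot(m, diff))         # fourth term
    return abs(lhs - rhs)

if __name__ == "__main__":
    rng = random.Random(1)
    print(max(expansion_gap(rng) for _ in range(100)))
```

The printed maximum is at the level of floating-point rounding, confirming that the two sides agree term by term.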

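We estimate each of the four terms on the right hand side of the last equality. By Cauchy-Schwarz the
first term is bounded, up to its prefactor $\sigma^2$, by

$$\|n - m\|\,\|n + m\| \le \sqrt{N}\max_i(|n_i| + |m_i|)\,\|n - m\| \le 2\sqrt{NA}\,\|n - m\|$$

Using Cauchy-Schwarz and $\|(s^0 - t^0)y\| \le \|s^0 - t^0\|\,\|y\|$, where $\|s^0 - t^0\| = \|s - t\|$ is the (Hilbert-Schmidt)
norm,

$$\|s - t\| = \Big(\sum_{i=1}^{N}\sum_{k=1}^{K}(s_{ik} - t_{ik})^2\Big)^{1/2}$$

we obtain for the second term the estimate

$$N^{-1}\|s - t\|\,\|y\|\,(\|s^0y\| + \|t^0y\|) \le N^{-1}\|s - t\|\,2\sqrt{K}\,2\sqrt{BN} = 4\sqrt{\beta}\sqrt{B}\,\|s - t\|$$

Similarly the third term is bounded, up to its prefactor $\sigma$, by

$$2N^{-1/2}\|n - m\|\,\|s^0y\| \le 2N^{-1/2}\|n - m\|\sqrt{BN} = 2\sqrt{B}\,\|n - m\|$$

and the fourth one, again up to its prefactor $\sigma$, by

$$2N^{-1/2}\|m\|\,\|s - t\|\,\|y\| \le 2N^{-1/2}\sqrt{NA}\,\|s - t\|\,2\sqrt{K} = 4\sqrt{K}\sqrt{A}\,\|s - t\|$$

Collecting all four estimates we obtain

$$\|N^{-1/2}s^0(x - 1) - \sigma n\|^2 - \|N^{-1/2}t^0(x - 1) - \sigma m\|^2$$

$$\le 2\sigma(\sigma\sqrt{NA} + \sqrt{B})\,\|n - m\| + 4\sqrt{\beta}(\sigma\sqrt{NA} + \sqrt{B})\,\|s - t\|$$

$$\le 2(2\sqrt{\beta} + \sigma)(\sigma\sqrt{NA} + \sqrt{B})\,\|(n, s) - (m, t)\|$$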
where the last norm is the Euclidean norm in $\mathbb{R}^N \times \mathbb{R}^{NK}$.

C.4 Proof of (82)

Let $s$ and $t$ be two spreading sequences both belonging to the appropriate $G$. Let $y = x - 1$. Following
similar steps as in the previous paragraph with $n = m$ the result can be read off

$$\|N^{-1/2}s^0y - \sigma n\|^2 - \|N^{-1/2}t^0y - \sigma n\|^2 \le 4\sqrt{\beta}\,(\sqrt{B} + \sigma\|n\|)\,\|s - t\|$$

D Proof of Theorem [3] 

The idea of this proof is based on p2"12] . [23] . 

Proof. Here, for simplicity of notation and without loss of generality, we assume the noise variance to
be 1 and the second and fourth moments of the spreading sequences to be less than 1. For $l \le K$, let $\phi_l$ be
the sigma algebra generated by $\{s_{ik} : 1 \le i \le N,\ 1 \le k \le l\}$, and set

$$f_l = \mathbb{E}[I(X;Y) \mid \phi_l], \qquad \psi_l = f_l - f_{l-1}$$

Then

$$\mathbb{E}\big(I(X;Y) - \mathbb{E}[I(X;Y)]\big)^2 = \sum_{l=1}^{K} \mathbb{E}[\psi_l^2]$$

The goal is to bound each term in this sum by O Here we use the following form of the mutual 

information 



where, 



^ k 

i i,k i k ik 

In the above expanded form, the first two terms do not involve x and hence the concentration of these 
terms follows very easily. Therefore, in the rest of the proof we consider the Hamiltonian with only the 
remaining two terms. From now on in the notation, we do not explicitly show the dependency of H on 
x° and x. To this end we define the following three Hamiltonians. 



Hl = E Slfc l Slfc 2 ~ Xfc l ) " Xk 2 ) 

ki ,k2j^l,i 



N ^1, 



J-f ^2 S ik s il( x l ~ x l){ x k - x k) + -y=^n*SflXj 

i.k i 



$$H_l(t) = H_l + tR_l$$

where $t \in [0, 1]$ will play the role of an interpolating parameter. We also introduce the difference of free
energies associated to the Hamiltonians $H_l(t)$ and $H_l$,

$$f_l(t) = \sum_{x^0} p_X(x^0)\big(\ln Z(H_l(t)) - \ln Z(H_l(0))\big)$$

In the last definition the partition function is defined by the usual summation over all configurations $x$.
With these definitions we have the representation

i>i = ^E> I+ i/,(l) - ^E>,/i(l) 

where $\mathbb{E}_{\ge l}$ means expectation with respect to $\{s_{ik},\ \forall k \ge l\}$. Using convexity in the form $(\mathbb{E}_{\ge l+1}[f_l(1)])^2 \le
\mathbb{E}_{\ge l+1}[f_l(1)^2]$, it follows that

E[Vf] < ^EE>, + i/,(l) 2 + ^EE>,/,(1) 2 

-^E[(E>, +1 /,(l)|0i-i)(E>,/,(l))] 
= ^E/,(1) 2 -^E[(E> I / J (1)) 2 ] 
< ^E/ ; (l) 2 

Notice that $f_l(0) = 0$ and $\frac{d^2}{dt^2} f_l(t) \ge 0$. Therefore,

$$f_l'(0) \le f_l(1) \le f_l'(1)$$

and

$$\mathbb{E}[f_l(1)^2] \le \mathbb{E}[f_l'(0)^2] + \mathbb{E}[f_l'(1)^2]$$

This shows that our task is reduced to a proof of $\mathbb{E}[f_l'(0)^2] = O(1)$, $\mathbb{E}[f_l'(1)^2] = O(1)$. This is a technical
calculation and is given in the next lemma. □
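The proof above rests on the fact that the variance of a function of independent variables equals the sum of the second moments of its martingale differences. The sketch below checks this by exact enumeration for a toy function of $K$ independent uniform $\pm 1$ variables. The observable `toy_observable` is a hypothetical stand-in for the mutual information, and each coordinate stands in for one column of the spreading matrix.

```python
import itertools

def conditional_mean(g, prefix, K):
    """E[g(s) | s_1..s_l = prefix] for i.i.d. uniform +-1 coordinates."""
    rest = K - len(prefix)
    values = [g(prefix + tail) for tail in itertools.product((-1, 1), repeat=rest)]
    return sum(values) / len(values)

def martingale_check(g, K):
    """Return (Var g, sum_l E[psi_l^2]) with psi_l = f_l - f_{l-1}."""
    configs = list(itertools.product((-1, 1), repeat=K))
    mean = conditional_mean(g, (), K)
    variance = sum((g(s) - mean) ** 2 for s in configs) / len(configs)
    sum_sq = 0.0
    for l in range(1, K + 1):
        for s in configs:
            psi = conditional_mean(g, s[:l], K) - conditional_mean(g, s[:l - 1], K)
            sum_sq += psi ** 2 / len(configs)
    return variance, sum_sq

def toy_observable(s):
    # Hypothetical test function; any function of the coordinates would do.
    return (sum(s) ** 2) / len(s) + 0.5 * s[0] * s[-1]

if __name__ == "__main__":
    print(martingale_check(toy_observable, K=6))
```

The two returned numbers coincide because martingale differences at different levels are orthogonal, which is why bounding each $\mathbb{E}[\psi_l^2]$ separately is enough to control the variance.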

Lemma 11. $\mathbb{E}[(f_l'(0))^2] = O(1)$, $\mathbb{E}[(f_l'(1))^2] = O(1)$.
Proof. From convexity,

(/'M) 2 <E^fe°)(^W) 

x° 

x° i 

^E^E^ 5 ^ - x ti( x °k - x ki) ) 



> 



k^l 



Hi(t) 

N " 'J / m{t) 

We will find a uniform bound for each term in the above sum over $x^0$. Let us consider a particular term
in the above sum and set $x_k^0 - x_k = z_{0k}$. We use the simple bound $z_{0k}^2 \le 4$ in the following and hence
we remove the average over $x^0$.



E[(./-'(0)) 2 ] < 12 

fcl,fc 2 #Z *1,*2 

+ 3E (j7 E n H n i2 S ill s i2l 



N ^ 11 12 111 i2 V^(0) 



Since H (0) does not depend on su and since they are symmetric random variables, in the above sums 
only those terms remain where su are repeated even number of times. Let J^i = s ik s u an d \\J\\ 

denote its largest singular value. Therefore, 

E[(./)(0)') 2 ] < 12 + 3E( E ^Jk 1<k2 zo kl zok 2 z 2 Ql ) + 3 
fcifc 2 






< 15 + 3 x 2 4 E||J|| + 3 = 0(1) 

where we use that E|| J|| = (1 + \/ r j3) 2 . For bounding E[(/)(l)') 2 ] we use symmetry of the indices and 
take the sum over I and divide by K . Let Aij = suSji. 

I &1,&2 



ii,*2 I 

; 12 + 6 x 2 4 E|| J|| 2 + 3E[|m| -J= ^K) 2 ] 



2V 

= 12 + 96E|| J|| 2 + 3E||A|| = 0(1) 
In order to estimate $\mathbb{E}\|J\|$ and $\mathbb{E}\|A\|$ one can use standard methods (see for example [26]). □
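The value $(1 + \sqrt{\beta})^2$ used above for $\mathbb{E}\|J\|$ can be illustrated numerically. The sketch below assumes, as a reading of the text and not a confirmed definition, that $J$ is the $K \times K$ matrix with entries $N^{-1}\sum_i s_{ik}s_{il}$ and that the $s_{ik}$ are i.i.d. uniform $\pm 1$. It estimates the largest eigenvalue by power iteration at finite size, so only approximate agreement with the large-system value is expected.

```python
import math
import random

def largest_eigenvalue_gram(N, K, rng, iterations=100):
    """Power-iteration estimate of the top eigenvalue of (1/N) s^T s."""
    s = [[rng.choice((-1.0, 1.0)) for _ in range(K)] for _ in range(N)]
    v = [rng.gauss(0.0, 1.0) for _ in range(K)]
    lam = 0.0
    for _ in range(iterations):
        sv = [sum(row[k] * v[k] for k in range(K)) for row in s]              # s v
        w = [sum(s[i][k] * sv[i] for i in range(N)) / N for k in range(K)]   # J v
        lam = sum(a * b for a, b in zip(v, w)) / sum(a * a for a in v)       # Rayleigh quotient
        norm_w = math.sqrt(sum(a * a for a in w))
        v = [a / norm_w for a in w]
    return lam

if __name__ == "__main__":
    rng = random.Random(0)
    N, K = 120, 60
    beta = K / N
    print(largest_eigenvalue_gram(N, K, rng), (1.0 + math.sqrt(beta)) ** 2)
```

At these sizes the estimate typically falls slightly below the limiting value, with the difference shrinking as $N$ and $K$ grow at fixed $\beta$.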

E Estimates (EH and (EL 



Let — xok ~ and z^°^ denote the vector (z[ , . . . , z^). Let us split the contribution from 
T\ — Ti in to Tn + T\z corresponding to the two terms appearing in (|59|) . For Xii(£, k), we get 



1 



Tn(z,fc) = ^=E rife [(r 2 fe ~l) / E. 



II 



d 2 g lk (u) 
du 2 



du 



(84) 



where gik{u) denotes the function in (|55[) with = w. Let (-)t,i,k denote the Gibbs measure with 
= u. Let vf(i) denote the vector v^t) with replaced by u. We now show that the term inside the 
integral decays with N. 



dgik{u) 



a 2 E(z 2 ) f -E((n l+2 ;^).z) 2 z 2 ) t 



du 2a 4 KN 

+ E^n, + N-tjuit) ■ z^){m + N-i 2i {t) • 2 (2) )4 1) 4 2) ) ) ( 85 ) 



d 2 gik(u 



, , a 2 ((n i + N-iv'{(t)-z)3z 3 k Vi) 

+ 3a 2 ((n l +iV-^(t).z( 2 ))(z( 1) ) 2 4 2) ^ 



t.i.k 



t.i.k 



+ ((n t + N~h^(t)-z) 3 zlVi) 

- 3((n< + JV-^(t) • z (1) ) 2 (n 4 + JV~ • ^{^fz^y/tj 
+ 2(n« = i A a(n i + AT *«*(t) ' l (a) )4 1} 4 2) 4 3) ^) ) (86) 

\ / t,i,k/ 

The Hamiltonians corresponding to (.)t and (-)t,i,k axe 

= ll« + ^^v(t)z|| 2 , = -^2 II" + A^-*v i)fe (%|| 2 



where Vj ; fc(t) differs from v(t) only in the (i, fc)th entry with m replacing r^. Expanding 



H it k(z} = -^2(nj + N ^Vj- z) 2 -(rii + N i^vuzi + N Vl - is, 

U 2 tZ? U\ftZk , 1 » 1 ; 

-± + —7=-{n t +N *2^vuzt+N ^Vl^isikZk) 



kZk) 2

Let the sum of the first two terms be denoted by $H'_{ik}(z)$ and the terms involving $u$ by $H''_{ik}(z)$. Consider
the following set



G = {n,r, S :Vi ^=K , ^ 



For sufficiently large $C$ we have $\mathbb{P}(G^c) = O(e^{-\alpha N})$ for some constant $\alpha > 0$. If $(n, s, r) \in G$, then for all
$z \in \{0, 2\}^K$



\H'l k {z)\<^f + 2\u\C^C'{u). 



(87) 



Therefore for the first term in the equation 

^Mth+N-^-z)) 

\ I t,i,k 



o 



fVA\ 



£ <^ ^i^w 1,0, < « 

+ E(\n l + N-h l -z\l {G c }i 

\ I t,i,k 

The expectation over G c can be bounded as 0(e~ aN )0(\u\). Therefore the last two terms contribute 
0(Ji=L). For the first term after we have removed the terms with u dependence, the Hamiltonian H' ik 
satisfies Nishimori symmetry. Therefore we get the first term to be equal to, 



E s 



.J J- e 2 % 1 J2e- H ^\n i + N~^vf ■ z- uVtN'^ z k \ dn 



Note that the above integral is a Gaussian integral and can be evaluated easily. Using a similar method,
we can show that



d 2 g ik (u) 



3C'( 



du 2 

The exponent 3 is due to the occurrence of 3 replicas in equation (86). Therefore,

r d 2 g ik (u) 



E,. 



(rtk ~ 1) 







du 2 



du 



<E r 



.C'M 



4k \ N-i{0{l)e 6 ^- +0(N-2\u\ & ))du 



(89) 



<0{N-i) 

where we have used the assumption A for the distribution of r^. Now summing this over all i, k we get 

\T n \ < 0(N~i) (90) 
Now consider the term $T_{12}$. For this we have to evaluate the following term.

d 3 g l k(u) _ t 



On* ~ 2a*KN 2 V ■ 3 ^ E ^)t + 6a 2 E((n l + N^vM ■ *) a *2) t 
- 12«7 2 E(n a=1 , 2 (n. i + N-ivS) ■ l ( ° ) )(4 1) ) 3 4 2) ) t 
+ 3a 4 E((zi 1) ) 2 (4 2) ) 2 ) t " ^E(( ni + N-iju® ■ ^WW^f 
+ 9a 2 E(n a=2 , 3 (n l + iV-^ i (t).z( a ))(z( 1) ) 2 4 2) 4 3) ) t 
-E^n i + N-iv i (t)-z) i zi) t 

+ 4E({m + N-ijuit) • z (1) ) 3 (n* + ^"^(t) • z^)(z[ 1} ) 3 zf ^ 
+ 3E(n a=1>2 (n l + Ar*£,(i) • ^ (a) ) 2 (4 1} ) 2 (4 2) ) 2 ) t 
- 12E/(n< + N-h^t) • z (1) ) 2 (n 4 + N-*m(t) ■ z^)(m + ■ ^{z^fz^ z { ^\ 

+ 6E(n a=1)2i 3,4K + iv-^ i (t).zW)4 1) 4 2) 4 3) 4 4) ) t ) 

We can prove along similar lines that |Xi2 1 < 0(N^ 1 ). 



F Nishimori Identities 

Proof of LemmaUl We only give a brief sketch because the method is standard (see for example [2"TIl2"g] ). 
One writes fully explicitly the expression for F t mi (x) and performs the gauge transformation $x_k \to x_k^0 x_k$,
$s_{ik} \to x_k^0 s_{ik}$ where $x^0$ is an arbitrary binary sequence. Since PL (x) does not depend on $x^0$ we sum over
all such $2^K$ sequences and obtain a lengthy expression. Exactly the same procedure is applied to P* 12 (x)
and one gets another lengthy expression. Then one can recognize that these two expressions are the same. 
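The gauge transformation works because each $x_k^0 = \pm 1$ squares to one, so the product $s_{ik}x_k$ is unchanged when both factors are multiplied by $x_k^0$. The sketch below checks this invariance for the quadratic form $\|r - N^{-1/2}sx\|^2$ and for a field term $h \cdot x$ under $h_k \to h_k x_k^0$. All names and sizes are hypothetical and chosen only for illustration.

```python
import random

def residual_energy(r, s, x):
    """||r - N^{-1/2} s x||^2 for an N x K matrix s and a +-1 vector x."""
    c = len(s) ** -0.5
    return sum((r_i - c * sum(a * b for a, b in zip(row, x))) ** 2
               for r_i, row in zip(r, s))

def field_term(h, x):
    """h . x for a field vector h and a +-1 vector x."""
    return sum(a * b for a, b in zip(h, x))

def gauge_transform(s, h, x, x0):
    """Apply s_ik -> s_ik x0_k, h_k -> h_k x0_k, x_k -> x0_k x_k."""
    s_new = [[a * g for a, g in zip(row, x0)] for row in s]
    h_new = [a * g for a, g in zip(h, x0)]
    x_new = [a * g for a, g in zip(x, x0)]
    return s_new, h_new, x_new

def gauge_gap(rng, N=5, K=4):
    """Largest change of the two terms under a random gauge transformation."""
    r = [rng.gauss(0.0, 1.0) for _ in range(N)]
    s = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(N)]
    h = [rng.gauss(0.0, 1.0) for _ in range(K)]
    x = [rng.choice((-1, 1)) for _ in range(K)]
    x0 = [rng.choice((-1, 1)) for _ in range(K)]
    s2, h2, x2 = gauge_transform(s, h, x, x0)
    return max(abs(residual_energy(r, s, x) - residual_energy(r, s2, x2)),
               abs(field_term(h, x) - field_term(h2, x2)))

if __name__ == "__main__":
    rng = random.Random(2)
    print(max(gauge_gap(rng) for _ in range(100)))
```

Since both terms of the exponent are invariant, summing over all $2^K$ choices of $x^0$ only reweights the spreading and field variables, which is the step the sketch of the proof relies on.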

Proof of Lemma [H 

Proof of (39). We will prove it for $t = 1$; for general $t$ it is similar. Let the transmitted sequence
be the all-one sequence, and the received vector be $r = \sigma n + N^{-1/2} s\underline{1}$ where $n_i \sim \mathcal{N}(0, 1)$. The proof
follows by using gauge symmetry. Let $\underline{u}$ denote the $K$ dimensional vector $(u, \ldots, u)$.

1 Hi-^ii 2 _ 



nm\\ 2 )i,u 



E s 

E s 
1 

2^ 



(2nu)f (2na 2 Y 
1 

(2 7 rtt)f(2 7 ra 2 ) i 
1 



e 2 » 2 



\\r-N-?s\\ 



\Z\\ )i u dr_ dh 



Hi " 



e 2 



iylll iV 2 Sx|| 2 +/t-a 



-c?r c?/i 



Ex' 



' iV 7 S K|| 2 + h-: 



E 



(27ru) J 2 t (27r<T 2 )^ 

^ e "^l\\L-N'hx\\ 2 + h-x^ 2 ^2 



- -iy 1 1 r - JV " 2 si 1 1 2 + h . x° 



111 — iV 2"s:r|j 2 +h-:E 



(91) 



= N 



(91) is obtained by performing the gauge transformation $x_k \to x_k^0 x_k$, $s_{ik} \to s_{ik}x_k^0$ and $h_k \to h_k x_k^0$ and
summing over all the $2^K$ possibilities of $x^0$. Now canceling the summation over $x^0$ with the denominator
and then integrating we get it to be equal to $N$.

Proof of (g0|). The proof is complete if we show E[((n • Z (2) )fe (1) ■ ^ (2) ))t,J = 0. We will prove this 
for t — 1 and it is similar for other t. 

E[((n-Z^)(x^-z^)) t , u ] 

= E[<(r< - Art J2 s «)(^ - ^ E s «4 2) )(4 1} - 4 1) 4 2) ))*,J 

Now performing the gauge transformation x^ — ► 4 ^fci 4^ — > x^x^, — > SikX® and ftfe /ifcX° we 
get 

m(n - N-i ]T wDfo - N ^ E sux^ix^xl - 4 1) 4 2) )>*,«] 



This quantity can be shown to be equal to 0 by noticing that $x^0$ and $x^{(2)}$ play symmetric roles.






G Proof of inequality ( 1411) 



For a given configuration of z, —i== s^z; = is a Gaussian random variable with mean and variance 
smaller than 4. Thus for rij ~ A/"(0, 1) and independent of Z, 



= E[e^^] < 



a' 2 - 4 



If a > 2, we have both the expectations to be less than some constant C > 1. Therefore for any z 
Using the Markov inequality, 

p (|^5>i-7=X>H >y N ) <ic N e-y N 



Using the union bound over $z$, for $y$ large enough there exists a constant $\gamma > 0$ such that



^i^ik^k 



i,k 



> ay) < 2 



>- 7 iV 



Let G be the event that 



]y572 J2i k n i s ik z k > cty holds for all z. Splitting the expectation into two parts 
corresponding to $G$ and $G^c$ and using the Cauchy-Schwarz inequality, we have



E 



— Y 

N 3/2 

<aV + V^)K(^E 

i,l

<a 2 y 2 + 0(2-i N )